


Institutional Archive of the Naval Postgraduate School 





Calhoun: The NPS Institutional Archive 
DSpace Repository 


Theses and Dissertations 1. Thesis and Dissertation Collection, all items 


1981 


Condition recognition for a program synthesizer 


Miller, Charles Wayne; Lape, Joseph Shawn 


http://hdl.handle.net/10945/20468


Downloaded from NPS Archive: Calhoun 


Calhoun is the Naval Postgraduate School's public access digital repository for
research materials and institutional publications created by the NPS community.
Calhoun is named for Professor of Mathematics Guy K. Calhoun, NPS's first
appointed (and published) scholarly author.


Dudley Knox Library / Naval Postgraduate School
411 Dyer Road / 1 University Circle 
Monterey, California USA 93943 





http://www.nps.edu/library 


CONDITION RECOGNITION FOR A PROGRAM
SYNTHESIZER


Charles Wayne Miller 








NAVAL POSTGRADUATE SCHOOL 


Monterey, California 





THESIS 


CONDITION RECOGNITION FOR A 
PROGRAM SYNTHESIZER 


by 


Joseph Shawn Lape 
and 


Charles Wayne Miller 


June 1981 


Thesis Advisor: Douglas R. Smith 





Approved for public release; distribution unlimited 





SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

2. GOVT ACCESSION NO.
3. RECIPIENT'S CATALOG NUMBER
4. TITLE (and Subtitle)
   Condition Recognition for a Program Synthesizer
5. TYPE OF REPORT & PERIOD COVERED
   Master's Thesis; June 1981
6. PERFORMING ORG. REPORT NUMBER
7. AUTHOR(s)
   Joseph Shawn Lape
   Charles Wayne Miller
8. CONTRACT OR GRANT NUMBER(s)
9. PERFORMING ORGANIZATION NAME AND ADDRESS
   Naval Postgraduate School
   Monterey, California 93940
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
11. CONTROLLING OFFICE NAME AND ADDRESS
   Naval Postgraduate School
   Monterey, California 93940
12. REPORT DATE
   June 1981
13. NUMBER OF PAGES
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)
15. SECURITY CLASS. (of this report)
   Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE
16. DISTRIBUTION STATEMENT (of this Report)
   Approved for public release; distribution unlimited
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)
18. SUPPLEMENTARY NOTES
19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
   program synthesis, automatic programming, conditions, example computation,
   static processing, dynamic processing, miniterms, character set hierarchy,
   condition recognition
20. ABSTRACT (Continue on reverse side if necessary and identify by block number)
   An enumeration algorithm which synthesizes programs from example
   computations is presented. The algorithm, originally proposed by Alan W.
   Biermann of Duke University, assigns a labelling of the instructions contained
   in an example trace consistent with producing minimum state Moore machine
   representations for the synthesized programs. Techniques for processing the
   information to reduce enumeration are given. Biermann's algorithm is
   extended by trace preprocessing techniques which identify and generalize
   conditions on instruction sequencing in the synthesized programs without the
   user's assistance. The techniques are presented using text editing as the
   domain, but are general enough to be extendable into other domains.

DD FORM 1473 (EDITION OF 1 NOV 65 IS OBSOLETE), S/N 0102-014-6601
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)





Approved for public release; distribution unlimited.


CONDITION RECOGNITION FOR A PROGRAM SYNTHESIZER


by

Charles Wayne Miller
Captain, United States Marine Corps
[degree illegible], Vanderbilt University, [year illegible]

and

Joseph Shawn Lape
Captain, United States Marine Corps
[degree illegible], University of Louisville, 1975


Submitted in partial fulfillment of the
requirements for the degree of


MASTER OF SCIENCE IN COMPUTER SCIENCE


from the


NAVAL POSTGRADUATE SCHOOL
June 1981





ABSTRACT 


An enumeration algorithm which synthesizes programs from
example computations is presented. The algorithm, originally
proposed by Alan W. Biermann of Duke University, assigns a
labelling of the instructions contained in an example trace
consistent with producing minimum state Moore machine
representations for the synthesized programs. Techniques for
processing the information to reduce enumeration are given.
Biermann's algorithm is extended by trace preprocessing
techniques which identify and generalize conditions on
instruction sequencing in the synthesized programs without
the user's assistance. The techniques are presented using
text editing as the domain, but are general enough to be
extendable into other domains.


I. INTRODUCTION


A. BACKGROUND 

Since the introduction of electronic computing machines,
manual tasks that are mundane, tedious and/or repetitious
have been considered for automation. The computer is ideally
suited for this type of work since it neither complains of
boredom nor wanders from its assigned task. The machine
meticulously sequences through a series of computations over
and over, producing answers consistent within the
limitations of the hardware. As consistent as the computer
is at performing tasks, assigning the tasks is still left to
the user of the system.

Programming the early machines was a difficult chore.
Communication between man and machine could be accomplished
only through the language of the machine. This machine
language consisted of binary-coded machine operations. The
efficient machine language programmer had to memorize these
codes or keep a list of the codes close by. Control transfer
points had to be coded in absolute machine addresses, which
the programmer calculated by hand. A programmer had to
interpret the binary representation of the machine
operations to determine the cause of errors in programs.
There were no diagnostic messages to aid the user in
isolating errors. The difficulty of programming in machine
language led to a search to find better ways of generating
programs. The first step was the recognition that the
computer was a good bookkeeper, capable of computing
absolute addresses from labels and translating mnemonic
representations of machine operation codes. Webster's New
World Dictionary, Second Edition, defines mnemonic to be "a
system or technique of improving memory by the use of
certain formulas." Soon programs were written which would
accept abstract programs containing mnemonics and labels,
convert the mnemonics into machine operation codes and
translate the labels into absolute machine addresses. These
programs produced executable machine language code as
output. These translation programs were called assemblers
and the data they translated were called assembly language
programs.
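The bookkeeping described above (labels to absolute addresses, mnemonics to operation codes) can be illustrated with a minimal two-pass sketch. The machine, its mnemonics and its opcode values are invented for this illustration and do not correspond to any machine discussed in the thesis; every instruction is assumed to occupy one word.

```python
# Hypothetical four-instruction machine; mnemonics and opcodes are invented.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HALT": 0xFF}

def assemble(source):
    """Two-pass assembly: collect label addresses, then emit (opcode, operand) words."""
    # Pass 1: a line ending in ':' labels the address of the next instruction.
    labels, instructions = {}, []
    for line in source.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())
    # Pass 2: translate mnemonics to opcodes and labels to absolute addresses.
    code = []
    for parts in instructions:
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if operand is None:
            value = 0
        elif operand in labels:
            value = labels[operand]
        else:
            value = int(operand)
        code.append((OPCODES[mnemonic], value))
    return code

PROGRAM = """
start:
    LOAD 7
loop:
    ADD 1
    JMP loop
    HALT
"""
machine_code = assemble(PROGRAM)
```

Here the label `loop` is resolved to address 1 without the programmer computing it by hand, which is the convenience the paragraph describes.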

Assembly language provided some automation of the manual
tasks associated with machine language programming. An
important convenience of assembly language is the
readability of the programs when compared to machine
language programs. The mnemonics convey the meaning of their
function while the labels relieve the programmer of
calculating absolute addresses for control transfer points.
Assembly language provided a level of abstraction which
allowed programmers to concentrate on the programming
problem without dealing with every atomic machine operation.
The assembler provided bookkeeping, address translation and
mnemonic decoding quickly and efficiently. With assembly
language, programmers were now capable of producing more
code in less time with fewer errors.

Assembly language eased the programmer's task, but it
still could not be considered a panacea for computer-human
interaction. Assembly language still required the programmer
to maintain control over many machine operations, and he had
to provide the logic to control the flow of program
execution. The instructions used to perform control
functions appear as similar code fragments in most programs
written in assembly language. These code fragments performed
functions such as controlling branching decisions and keeping
count of loop indices. When it was observed that common code
fragments appeared across a wide range of assembly programs,
it was recognized that these code fragments could be
represented as a single instruction and the computer could
translate the single instruction into the code fragment it
represented. The programs that translate these complex
instructions are called compilers or interpreters. The
compiled or interpreted languages that followed assembly
language in this evolutionary process incorporated the
program fragments as a single instruction for the language.
Constructs such as FOR, DO WHILE and IF THEN are examples of
higher level control structure implementation.

FORTRAN was the first in a long line of higher level
languages. FORTRAN differed from the others by becoming
endeared to a family of users, and the language endures today
as one of the most frequently used higher level languages.
What qualities of the language produced this popularity?

The FORTRAN language is attributed to John Backus. His
primary goal when designing the language was to make the
language resemble the notation used in high school algebra.
Because the notation used in high school algebra was familiar
to a wide audience, FORTRAN gave a friendly appearance. The
language's apparent simplicity is the endearing quality of
FORTRAN. Some other language implementors failed to
recognize this point and their languages never received wide
acceptance. ALGOL is an example of a powerful language that
never received the acceptance anticipated.

Other programming languages that followed added compact
representation of other recurring program fragments. The
higher level constructs were not limited to control
structures but also included constructs for data
manipulation functions. Iverson's [1] APL (A Programming
Language) provided powerful operators capable of performing
complex functions such as matrix multiplication in one
instruction.

This trend continues today. Many of the newer languages
implement sophisticated and powerful operators and control
structures. Some of these languages are for a select segment
of computer users, intended for application to a particular
domain. The users are expected to be familiar with the
domain, so the form of the language should be familiar to
the user also. A problem with a domain specific language is
its inability to adapt to other areas. To work in another
area the user must become familiar with another language. A
phenomenon demonstrated by many computer users is a
reluctance to adapt themselves and learn a new language that
may be more appropriate for a given task. Either they break
the egg with a sledge hammer or dig the well with a spoon.
When required to use a new language, the user will likely
use only a small subset of the language that is capable of
doing the job. Worse than using only a subset of the
language features is the tendency to bring old programming
habits applicable to the old language into the new language.
The point to be made is that learning a new programming
language is a hard chore and is avoided whenever possible.

Another direction which the automation of programming
tasks has taken is the development of a programming
environment. A programming environment automates some of the
manual chores by providing the user with aids that assist
him in constructing programs. The environment includes a
programming language, an interactive syntax-directed editor
and an on-line debugger. The editor provides syntax error
diagnostics while the programmer is creating the source
file. The programmer is forced to correct the syntax error
immediately before the editor will allow him to continue
programming. The error should be readily apparent to the
programmer because it is in the latest input. The on-line
debugger allows the programmer to actively test his program,
halt execution, check the value of variables, change the
value of variables or change the code itself. Program
environment systems may even allow the programmer to switch
from the editor to the on-line debugger and back at any
time. A programming environment can be summarized as a
friendly interface utilizing an intelligent editor which can
recognize syntax errors in the associated programming
language and one that contains other interactive programming
tools.

Programming has been called an art form requiring
intellectual creativity. The automation of intellectual
behavior is a field of study within Computer Science called
Artificial Intelligence. The study of the automation of
programming tasks which require human-like reasoning is
called Program Synthesis or Automatic Programming. It is not
our intention to provide a definition of intelligent
behavior for a machine since there is considerable
disagreement even among the experts. However, we note that
the goal of research in automatic programming is the same
goal that led to all the advances in programming languages.
Informally, this goal is to make the interaction between man
and computer as painless as possible. That is, painless for
the man but not necessarily for the computer. Dijkstra [2]
objects to our automation of programming by claiming, "We
should not automate programming even if we can, because it
would take away our enjoyment of the task." We note there
are those who may require the use of computer services that
have neither the time nor inclination to obtain the required
education to do that chore. These include professions such
as lawyers, physicians, and even theoretical physicists. We
assume, if programming becomes fully automated, the
programmers will then turn their attention toward other
creative and stimulating pursuits. R. Hamming has said, "The
purpose of computing is insight not numbers."

Many ongoing efforts are aimed at providing better
systems for the user so he may create programs faster, with
fewer errors and with less effort. The history of programming
language development has shown that automation of many
programming tasks is feasible. How much more of the
programming tasks can be automated? What would be considered
the ultimate system for producing computer programs?


B. AUTOMATIC PROGRAMMING 
1. General 
Program synthesis or automatic programming is a
research topic concerned with the development of systems
that provide more and more automation of the programming
process, particularly those tasks requiring human-like
reasoning. The goal is not to create systems that program
themselves, but to create systems which can construct, under
the direction of a user, programs that can perform some
function he desires. These systems must be easy to use, easy
to learn, and increase the efficiency of the user. The users
of these systems will no longer be restricted to the few
computer professionals, but will include other professional
fields as well as non-professionals. Automatic programming
systems are to interact with the user, recognize
requirements, and then synthesize a correct program that
satisfies the requirements.

Two questions arise in the research on automatic
programming. First, what is the form of the interaction
between the user and the system? This question is called the
specification problem because it is concerned with issues
relating to how the user is to inform the system of his
requirements. The second question is, given a specification
method, what synthesis technique is available to be applied
that will transform the specification into an appropriate
program? The technique used for synthesis is often dependent
upon the form of the problem specification and most of the
projects involving automatic programming consider both
problems together. It has been proposed by Green [3] that
these two questions should be separated with research
proceeding concurrently on both problems. He proposes there
is a standard intermediate representation of the problem
specification which would permit interaction between the two
problems.

Four techniques have been proposed for the
specification problem which dominate the literature on
automatic programming. Each of the proposed techniques of
problem specification introduces a different approach to the
synthesis problem. The four specification techniques can be
categorized as follows:

1. Natural Language.

2. Formal Problem Specification.

3. Input-Output Pairs.

4. Example Computations.

Each of these specification techniques will be discussed in
the following subsections, together with its relationship to
a synthesis approach.

2. Problem Specification with Natural Language

A visionary approach to the specification problem is
the use of natural language. Natural language provides a
comfortable method of communication which is already
understood by humans. Implementation of a natural language
understanding system has proven to be a very difficult
problem (Glass [4]).

Two forms of natural language are the spoken form
and the written form. Understanding spoken language
increases the degree of difficulty because the communication
is in the form of audio waves. Once the audio input is
captured, it must be converted into another form for further
syntactic and semantic analysis. The reader will note that
once the audio input has been captured and converted the
problem of written and spoken language becomes the same.
That is, the internal representation of the spoken and
written word can be the same and the problem becomes one of
inferring meaning from the representation. Future advances
in voice understanding hardware can be expected and these
advances may be expected to find their way into use.

A complete natural language understanding system
would be expected to be able to understand all grammatically
correct sentences. However, natural languages do not have
finite grammars. This complexity implies a complete
understanding system cannot be implemented. However, a
system capable of understanding a subset of natural language
can prove useful in specific domains. Early examples of
programming through natural language dialogue are presented
in a survey by Heidorn [5]. Current work on understanding
natural language may be found in Biermann [6] and Walker
[7].

In conclusion, natural language understanding is a
difficult problem that can be solved only in limited
domains. The use of natural language in programming has been
shown to be possible by Heidorn [5], and by Biermann [6] in
limited domains. The systems developed up to today have been
experimental systems and the results will aid in
understanding the problem. Natural language programming
systems will not be available for industry for at least a
decade. Finally, we present the example Biermann [6]
describes as a natural language specification for a problem.
This example is quoted from his paper on natural language
programming. Its intent is to give a feel for programming in
natural language. This example does not specify the
algorithm that is to be used although a natural language
programming system would be capable of accepting such a
specification.

    "When I ask for a status report on a
    doctoral student, give me his or her year
    in school, source and amount of
    financial support, and which core exams
    have been passed. If the student has begun
    [illegible] give me the advisor and thesis
    topic."
3. Formal Problem Specification
The second technique is formal specification of the
problem. As the name implies, the input is in a more rigid
structure than natural language. This technique allows the
user to convey the behavior he desires the synthesized
program to have without specifying the algorithm that is to
be used. Smith [8] gives the following definition for the
form of a formal specification of a problem A.

    "A(x) = z such that z ∈ S & P(z,x) where x ∈ D &
    I(x), where D and S are the input and output data
    types respectively, and I and P are the input and
    output conditions respectively."

An example of a formal problem specification for a program
to compute the integer square root of a nonnegative integer
n may be found in Manna and Waldinger [9].

    sqrt(n) <== FIND z SUCH THAT
                  integer(z) & z**2 =< n < (z + 1) ** 2
                WHERE integer(n) & 0 =< n

In the above example n is an element of the input data type,
z is an element of the output data type, sqrt is the problem
name, integer(n) & 0 =< n is the input condition, and
integer(z) & z**2 =< n < (z+1) ** 2 is the output condition.
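To make the roles of the input and output conditions concrete, the square root specification can be written as two predicates, with a deliberately naive search standing in for "FIND z SUCH THAT". The function names are invented for this sketch. The search only shows what the specification demands and is not one of the synthesis methods discussed here, which aim to derive an efficient program from the same specification.

```python
def input_condition(n):
    # I(n): integer(n) & 0 =< n
    return isinstance(n, int) and 0 <= n

def output_condition(z, n):
    # P(z, n): integer(z) & z**2 =< n < (z+1)**2
    return isinstance(z, int) and z ** 2 <= n < (z + 1) ** 2

def sqrt_by_search(n):
    """Naive reading of the specification: try candidates until P(z, n) holds."""
    if not input_condition(n):
        raise ValueError("input condition violated")
    z = 0
    while not output_condition(z, n):
        z += 1
    return z
```

For example, `sqrt_by_search(15)` yields 3, since 3**2 =< 15 < 4**2. Note that the specification states what z must satisfy but says nothing about how to find it.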

Formal problem specification and its application to
the program synthesis problem can best be explained through
examination of the work by Manna and Waldinger [9], Manna
and Waldinger [10], and Smith [8]. All of the work is
similar in that the formal specification is changed into an
appropriate program by some form of rewrite, although it is
possible to differentiate the approaches by their rewriting
methods.

The first example is the system of Manna and
Waldinger [9]. Their system, called a deductive approach,
converts the formal specification into a program in some
target language. Their approach combines techniques of
unification, mathematical induction and transformation
rules into a single system. The following is a brief
explanation of this conversion.

A structure is needed to contain initial and
intermediate results of the conversion process. This
structure is called a sequent. The sequent is a tableau
containing two lists. The first list is a list of assertions
and the second list is a list of goals. Each element in
either list may have an output expression associated with
it. Figure 1 represents a sequent as a table. Each row in
the table may contain either an assertion or a goal but not
both. Figure 1 is the initial sequent for the integer square
root problem given above. The input condition has been
placed in the assertion list and the output condition placed
in the goal list. The output variable is associated with the
output condition in the output expression column. This
initiation action assumes the input condition is true and a
proof is attempted for the truth of the goal or output
condition.


sqrt(n) <== FIND z SUCH THAT
              integer(z) and z**2 =< n
              and n < (z+1) ** 2
            WHERE integer(n) and 0 =< n


+--------------+-----------------+----------+
| Assertions   | Goals           | Output   |
|              |                 | sqrt(n)  |
+--------------+-----------------+----------+
| integer(n)   |                 |          |
|    and       |                 |          |
| 0 =< n       |                 |          |
+--------------+-----------------+----------+
|              | integer(z)      |          |
|              |    and          |          |
|              | z**2 =< n       |    z     |
|              |    and          |          |
|              | n < (z+1)**2    |          |
+--------------+-----------------+----------+


Figure 1. Initialized Sequent for the Square Root Problem

During this search, if the sequent ever contains a row where
the assertion can be trivially shown to be false or the goal
shown to be true, and if the output expression for that row
contains only primitives from the target language, then the
output expression is taken as the desired synthesized
program.

Once the tableau is initialized, the system's
deductive rules are applied to the assertions and goals. The
application of these rules will cause the creation of new
assertions and goals and associated output expressions. The
rules may then be applied to the new goals and assertions
until the condition for a program is satisfied. The
application of the rules changes the entries in the tableau
without changing the meaning of the tableau. We recommend
that the interested reader review the original work for a
description of the rules and their application.

The attraction of this theorem-proving technique is
that the resulting program can be proven correct by the same
rules used to create it. Currently there is not a running
implementation of this technique. One of the implementation
questions is determining what rule to apply at each step in
the synthesis process. This problem can be viewed as a
search through all possible sequences of rule applications.
The search space may become astronomical for any relatively
complex program since it may require hundreds of rule
applications. What is needed is a mechanism that can control
the search in a reasonable fashion. The form of control may
be heuristic in that there is a feel for where a rule should
be applied. If this intuitive feel can be quantized, then
this technique may become practical.

Earlier work by Manna and Waldinger [10] on the
DEDALUS automatic programming system also required formal
problem specifications. The DEDALUS system, an implemented
automatic programming system, utilized only transformation
rules. A transformation rule simply rewrites a portion of the
specification into another equivalent form. The continuous
application of these rules would eventually result in a
program in the target language.
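The idea of repeatedly applying transformation rules until none applies can be sketched with a small term rewriter. The term encoding (nested tuples) and the three rules below are invented for illustration; they are not the rules of DEDALUS, whose rule set is described in the original work.

```python
def rewrite_once(term, rules):
    """Apply the first matching rule at the root or, failing that, inside a subterm."""
    for rule in rules:
        result = rule(term)
        if result is not None:
            return result, True
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            new_sub, changed = rewrite_once(sub, rules)
            if changed:
                return term[:i] + (new_sub,) + term[i + 1:], True
    return term, False

def normalize(term, rules, max_steps=1000):
    """Rewrite until no rule applies; give up after max_steps applications."""
    for _ in range(max_steps):
        term, changed = rewrite_once(term, rules)
        if not changed:
            return term
    raise RuntimeError("no normal form within step limit")

# Hypothetical rules over terms of the form (operator, left, right).
def add_zero(t):   # x + 0  ->  x
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "+" and t[2] == 0:
        return t[1]
    return None

def mul_one(t):    # x * 1  ->  x
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "*" and t[2] == 1:
        return t[1]
    return None

def square(t):     # x ** 2  ->  x * x
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "**" and t[2] == 2:
        return ("*", t[1], t[1])
    return None

RULES = [add_zero, mul_one, square]
result = normalize(("+", ("**", ("*", "z", 1), 2), 0), RULES)
```

Each rule replaces a fragment by an equivalent one, so the meaning of the whole term is preserved at every step. The step limit is a guard added for the sketch, since nothing in general guarantees that an arbitrary rule set terminates.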

4. Input-Output Pair Specification 

Input-output pairs are a method of describing a
problem with examples of input and output behavior. For
example, if someone wanted to describe a program to compute
the Fibonacci numbers then he could supply the input-output
pairs

    (1, 2)
    (2, 3)
    (3, 5)
    (5, 8)
    (8, 13)

The goal of a synthesizer system is to determine the
desired program from the examples of the input-output
behavior. One approach is to enumerate all possible programs
in the target language in order and test each program for
the desired behavior. That is, test each enumerated program
by giving it the input from each of the examples and see if
the program will give the associated output. The enumeration
will produce the correct program at some point, but you
cannot determine if an arbitrary program can produce the
desired behavior (see Biermann [11]). Therefore, the
following theorem is given by Biermann: "The programs for
the partial recursive functions cannot be generated from
samples of input-output behavior." A large class of programs
can be inferred from examples of input-output pairs provided
they belong to the class of programs where the halting
problem is decidable. Smith [12] and Summers [13] have
looked at the synthesis of LISP programs from example
input-output pairs. It has been shown that a restricted
class of LISP programs can be synthesized from example pairs
without enumeration over the class. The reader is invited to
review Biermann [14] and Gold [15] for theoretical
background information.
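The enumerate-and-test approach can be sketched for a very small target language in which every program halts, so the testing step is always decidable. The language (arithmetic expressions over x, 1 and 2 with + and *), the depth bound and the example pairs are assumptions made for this sketch; they are not taken from the cited work.

```python
from itertools import product

LEAVES = ["x", "1", "2"]
OPS = ["+", "*"]

def expressions(depth):
    """All expression strings of nesting depth at most `depth`, smallest first."""
    if depth == 0:
        return list(LEAVES)
    smaller = expressions(depth - 1)
    combined = [f"({a} {op} {b})" for a, op, b in product(smaller, OPS, smaller)]
    return smaller + combined

def synthesize(pairs, max_depth=2):
    """Return the first enumerated expression that reproduces every (input, output) pair."""
    for depth in range(max_depth + 1):
        for expr in expressions(depth):
            # Every expression here is total, so each test is guaranteed to finish.
            if all(eval(expr, {"__builtins__": {}}, {"x": x}) == y for x, y in pairs):
                return expr
    return None

# Hypothetical example pairs describing f(x) = 2x + 1.
PAIRS = [(0, 1), (1, 3), (2, 5)]
found = synthesize(PAIRS)
```

With these pairs the search returns `(x + (x + 1))`. The candidate list grows very quickly with depth, which illustrates why methods that avoid enumeration over the class are of interest.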
5. Example Computations

Program specification using example computations
allows more information to be obtained from the user. An
example computation is a sequence of instructions, without
the explicit control structure, which the user provides the
system in order to describe the behavior he wants from a
program. Examples are a good communication method which
people use to describe new concepts or explain new
processes. To describe a problem to the computer the user
uses the available instructions and provides an example of
what he wants done. Figure 2 shows an example computation
that demonstrates how to compute the first 100 Fibonacci
numbers.

In Figure 2 the two operand instructions (MOV, ADD)
perform the action on the two operands and leave the result
in the first operand. For example, if A = 2 and B = 3 then
ADD A,B would result in A = 5 and B = 3. All of the
instructions perform an action on some variables except for
the START, HALT, and NOTE instructions. START and HALT flag
the beginning and end of the program respectively. The NOTE
instruction provides information on the reason for the
execution of the next instruction.
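The instruction semantics just described can be sketched as a small interpreter for a straight-line trace. The trace below is a hypothetical short Fibonacci computation written for this sketch; it is not a transcription of Figure 2. DCR is assumed to decrement its operand by one, and NOTE lines are treated as comments.

```python
def run_trace(trace):
    """Execute a straight-line example computation; return printed values and registers."""
    registers, printed = {}, []
    for instruction in trace:
        op, *args = instruction.replace(",", " ").split()
        if op in ("START", "HALT", "NOTE"):
            continue  # these instructions act on no variables
        if op == "MOV":
            dst, src = args
            registers[dst] = registers[src] if src in registers else int(src)
        elif op == "ADD":
            dst, src = args  # the result is left in the first operand
            registers[dst] += registers[src] if src in registers else int(src)
        elif op == "DCR":
            registers[args[0]] -= 1
        elif op == "PRINT":
            printed.append(registers[args[0]])
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return printed, registers

# Hypothetical trace: print four Fibonacci numbers, counting down in C.
TRACE = [
    "START", "MOV A,1", "MOV B,1", "MOV C,4",
    "PRINT A", "DCR C", "PRINT B", "DCR C",
    "NOTE C>0", "ADD A,B", "PRINT A", "DCR C",
    "NOTE C>0", "ADD B,A", "PRINT B", "DCR C",
    "HALT",
]
printed, registers = run_trace(TRACE)
```

The trace contains no loops or branches. The repetition is only visible as recurring instructions, and recovering the loop from that repetition is the job of the synthesizer.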

This method of specification depends on the user to
supply more information about the problem, including the
algorithm to be synthesized. The algorithm is implicitly
defined by the example computation that is given. This
specification technique should be contrasted with the
previous techniques. Note that the formal specification and
the input-output pair specification only required the user
to specify the desired behavior without specifying the
algorithm. Thus it can be claimed that these two methods
intentionally ignore information that the user has, assuming
that most users have an idea of the form of the algorithm.

[Figure 2 is a listing of the example trace. The scanned
listing is only partly legible; the recoverable instructions
are START, MOV, PRINT, DCR C, ADD, PRINT B, PRINT A and a
condition on C.]

Figure 2. An Example Computation

The primary contributor to the understanding of
program synthesis has been Alan W. Biermann (see Biermann
and Krishnaswamy [16] and Biermann, Baum and Petry [17]). In
particular, Biermann [16] provides a formal definition of an
algorithm that will synthesize programs from example
computations. The algorithm and variations have provided the
basic structure upon which this thesis has been developed.
Briefly, the algorithm identifies the conditions that may
have inadvertently (or purposely) been left out of the
computation. A condition is a predicate as defined in
predicate calculus, that is, an entity for which a truth
value may be measured. Once the omitted conditions have been
inserted, the algorithm finds a labelling for the
instructions such that a program with a minimum number of
instructions is produced. To explain this labelling, assume
the instruction ADD A,B appears in three different locations
in an example computation (see Figure 2). Suppose it was
known that there have to be two occurrences of the
instruction. Then two of the instructions could be labeled
with a 1 and the other instruction labeled with a 2 to
indicate that the instruction labeled 2 is different from
the instructions labeled 1. Finding the labels for the
instructions in the example computations requires an
enumeration search of all possible labellings. The labelling
selected is the first labelling that produces a program that
is deterministic.

The algorithm is complete and the synthesized
programs are sound. Completeness means that the algorithm
can synthesize every possible program. Soundness means that
the synthesized program will correctly execute the example
used to construct it. A disadvantage of this synthesis
method is that the algorithm is an enumeration search and in
the worst case will require exponential time in the length
of the example computation to find a solution. Techniques
have been developed to speed up this search that will
produce satisfactory response for most practical programs.

6. A General Automatic Programmer Design 

Before leaving this section on automatic programming we
wish to discuss a design for an automatic programmer that
uses at least two of the specification techniques. The name
of the system is PSI and was designed by a group of
researchers at Stanford's Artificial Intelligence
Laboratory. The research effort was headed by Cordell Green
[S$]. Green has presented a high level design of an
autoprogrammer that identifies some of the more important
areas that need further research. Green admits that the
design was an effort to focus attention on some of the
sub-areas of the overall synthesis problem. His modular
design does focus attention on different aspects of the
problem. The design decision to split the overall problem
into two main sub-problems of acquisition and synthesis is of
particular interest. This design choice allows work to
proceed concurrently on two hard problems with the interface
between the problems being some intermediate representation
of the problem.

PSI is a knowledge-based program understanding
system organized as a collection of interacting modules.
Figure 3 details the high level modular design of the PSI
system. The PSI design divides the system into two groups.
The acquisition group interfaces with the user and collects
the specification given by the user while the synthesis
group produces a program in some target language that meets
the user's requirements. Communication between the two
major groups is through an intermediate representation
called the program model. The goal of the acquisition group
is to accept the user's specification by either natural
language dialogue or by traces, and present a unified entity
to the synthesizer group. The implementation of the
synthesizer group is then simplified because of the
consistent representation it receives. Since the user's
input is converted into an intermediate representation that
is supplied to the synthesizer group, the user is free to
switch from one specification technique to another during
program specification.

The overall interaction with the user is meant to be
through natural language dialogue. Since natural language
understanding is not currently within the
state of the art, the system must interact in a subset of
natural language limited to a particular domain.

The system-user interaction is to appear as natural
as possible. The system has been designed to include a
mixed-initiative dialogue capability which means the user or
the computer can assume the dominant communication role at
different times during the discourse. This allows the user
to provide as much knowledge as he can to help the synthesis
process and allows the computer to assist the user by asking
questions or providing responses. The system develops a
model of the user and a model of the context that
assists the system in determining when to assume the
initiative and what questions to ask the user.

A partial implementation was completed in 1976 that
included the synthesis expert and the efficiency expert from
the synthesis group. The acquisition group modules have
proven to be a more difficult assignment and only portions
of the acquisition group have been implemented. The important
point of the PSI design is that it provides a modular
division of the program synthesis problem that helps provoke
study into these sub-problems.


C. OBJECTIVES

Automatic programmers, which synthesize programs from
example computations, require conditions to be explicitly
defined by the user in order to generate programs with a
minimum number of instructions. Previous work (Biermann and
Krishnaswamy [16], and Biermann [18]) has reduced the
number of required conditions, but has not eliminated the
need for the user to explicitly state a minimal set of
conditions.

The explicit definition of conditions is not a natural
part of an example computation. That is, one would not
normally give control structure information when using
examples to explain how a task is to be performed. Our
objective is to provide an environment where the user may
define the tasks he wants accomplished without explicitly
defining the control structures that specify the flow of
execution in a synthesized program.

We will implement an automatic programming system based
upon the example computation specification method in order
to study the feasibility of identifying conditions from user
actions. We limit this study to the domain of text editing
in order to provide a well defined area in which to work. It
is hoped that the results of our efforts may provide insight
into the overall problem and generate further research which
will extend condition identification to other domains.


D. THESIS ORGANIZATION 

The thrust of this thesis is the development of methods
for the automatic construction of conditions necessary for
the proper synthesis of programs from example computations.
Example computation is one approach to the problem of
program synthesis. Chapter One introduces the reader to
program synthesis and gives a brief historical perspective
on the evolution of this field of study. Chapter One also
provides a comparison of the different proposed approaches
to this problem.

An automatic programmer has been implemented to support
this research. This synthesizer was developed to use the
example computation method for program specification.
Chapter Two is a detailed explanation of our synthesizer
implementation. Chapter Two includes a discussion of
techniques we have incorporated in our implementation which
speed up the synthesis process.

Chapter Three presents our approach to generating
conditions given an example computation. It describes
algorithms which will generate conditions from a sequence of
editor instructions.

Chapter Four discusses the result of our research. A
brief discussion is included on the merits of the
synthesizer which we have implemented and recommendations
are given for potential improvement. Finally, Chapter Four
presents a review of our work on identification and
construction of conditions from example computations. Areas
requiring further research have been highlighted and
examples of possible applications to other domains have been
pointed out.





II. SYNTHESIZER


There is a two-fold purpose behind designing and
building the program synthesizer. The first directly relates
to the usefulness of the synthesizer. It is hoped that by
laying the groundwork for an autoprogramming system, the
impetus will be provided that will eventually result in a
useful automatic programming environment being available for
the user. This environment is envisioned as an interactive
system consisting of several components: an interface to
provide the user with the means to perform example
computations, a link between the interface and the
synthesizer which records the user actions and transmits a
trace of those actions to the synthesizer, the synthesizer
itself which produces the algorithm in some internal form,
and finally, a translator that receives the internal
representation of the algorithm and translates it into
machine-readable form and/or user-readable form. The second
purpose for which the synthesizer is built is to provide a
suitable vehicle to be used in the main area of research
this thesis explores. If an autoprogrammer can generate
correct algorithms from example computations, how much can
be done to relieve the user from having to include branching
and looping conditions in his example computations?





A. OVERVIEW

1. General Description

An automatic programming system which produces
programs based upon the user's input of example computations
has a natural appeal. Example computations are sequences of
instructions performed in an algorithmic manner. For
instance, if the user is doing a matrix multiply, computing
an entry for the resultant matrix involves the sum of
products from the appropriate row and column of the
multiplicand and multiplier matrices, respectively. When
persons communicate ideas to each other, the proper use of
example computations often plays a vital role. It is hard to
imagine trying to explain the method of multiplying two
matrices together, or trying to explain the concept of
set-subset relationships without being able to draw examples
that enhance the explanations. This method of communication
seems to be vital to human understanding of algorithms.
Since programmers often use small example computations while
coding programs, it seems that a logical approach to
automatic programming would consist of the machine doing the
actual program synthesis based upon example computations
given by the programmer.

Program synthesis is the act of putting instructions
together in such a way that an algorithm is built which
accomplishes a desired task. Obviously, an algorithm which
is an exact replication of the sequence of instructions will
accomplish this task, but it is uninteresting since it cannot
be generalized to accomplish a set of related tasks. For
example, a linear sequence of instructions which multiplies
two 2 x 2 matrices together will only work for 2 x 2
matrices. However, by allowing loop constructs and if-then
constructs, an algorithm can be produced which performs the
more general task of multiplying any two matrices with legal
row and column dimensions. So, in the case of the matrix
multiply, the task of the program synthesizer is to produce
a general matrix multiply algorithm given the example
computation for a 2 x 2 matrix multiplication in some form
such as:

c[1,1] = a[1,1] * b[1,1] + a[1,2] * b[2,1]
c[1,2] = a[1,1] * b[1,2] + a[1,2] * b[2,2]
c[2,1] = a[2,1] * b[1,1] + a[2,2] * b[2,1]
c[2,2] = a[2,1] * b[1,2] + a[2,2] * b[2,2]
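
The kind of generalization the synthesizer is asked to perform
can be illustrated with a short sketch. The function name
`matmul` and the list-of-rows matrix representation are
assumptions made for this illustration only; the code shows the
looped algorithm that the four straight-line statements above
are an instance of, not output of the synthesizer itself.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative only)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("column count of a must equal row count of b")
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):            # ends when the row bound is reached
        for j in range(cols):        # ends when the column bound is reached
            for k in range(inner):   # accumulates the sum of products
                c[i][j] += a[i][k] * b[k][j]
    return c
```

Each loop bound here plays the role of a condition that the
example computation alone does not state.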


Generalizing from the example computation also
requires some means of noting when the array bounds have
been reached for this example. In other words, conditions
have to be interposed between some instructions where a
branch in the flow of control for the algorithm is
necessary. An input trace is defined as a sequence of
instructions and conditions which describes the example
computation. In the matrix multiply example this might be
accomplished thusly:

C[1,1] = 0
C[1,1] = C[1,1] + A[1,1] * B[1,1]
C[1,1] = C[1,1] + A[1,2] * B[2,1]

Cees 
Cit ds] 


i Ot 


COND - col index of A = col size of A

C[1,2] = 0
C[1,2] = C[1,2] + A[1,1] * B[1,2]
C[1,2] = C[1,2] + A[1,2] * B[2,2]

COND - col index of A = col size of A

C[2,2] = C[2,2] + A[2,2] * B[2,2]
COND - row & col index of C = Dimension of C
Ser 

The program synthesizer used for this thesis is
designed around concepts and ideas on synthesizing a program
given example traces as described in reference [17].
Previous research, references [16], [17], and [18], seems to
indicate that correct programs can be synthesized on the
basis of relatively few sample computations, but that the
amount of time required to do the synthesis grows very
rapidly as a function of program complexity.

2. Trace Coding

The synthesis procedure is domain independent; that
is, the input trace can be coded into any consistent
representation, and it will not affect the operation of the
synthesizer. Since the synthesis procedure is independent of
the input trace representation, alphanumeric characters will
be used to represent instructions and conditions. They are
distinguished from each other by their position within the
trace rather than by their symbolic representation. For
example, an 'a' might represent an instruction or a
condition. Within the instruction set itself, identical
instructions are encoded as identical symbols. A simple
trace of a routine to find all positive numbers in an input
stream might be:


A = 0
READ B

COND - B is negative

A = A + 1
READ B

COND - B is negative

A = A + 1
READ B

COND - B is positive


PRINT 5 


? 


If the instruction A=A+1 is represented by a 'b', each
occurrence of that instruction in the trace will have to be
represented by a 'b'. The reason for this constraint is
obvious. Since the synthesizer only receives a trace of the
example execution, it cannot determine whether A=A+1 is the
same instruction being encountered repeatedly in a loop, as
it is in this example, or whether there are several
independent occurrences of A=A+1. Figure 4 is an example of
a typical coded input trace. The left-hand column entries
are conditions and the right-hand column entries are
instructions. Figure 4 is read as state 's' transitions on
condition 'x' to state 'a' which in turn transitions on 'x'
to state 'b', and so forth.


transitions     states


Figure 4. Input Trace
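
The coding constraint described above (identical instructions
must receive identical symbols, while instructions and
conditions are told apart only by position) can be sketched as
follows. The function name `encode_trace`, the `("instr" |
"cond", text)` input format, and the choice of lowercase letters
are assumptions for this illustration, and the sketch supports
at most 26 distinct instructions and 26 distinct conditions.

```python
import string


def encode_trace(steps):
    """Code a trace so identical steps of one kind share one symbol.

    steps: list of (kind, text) pairs, kind being "instr" or "cond".
    Instructions and conditions draw from separate symbol tables, so
    the same letter may stand for an instruction and for a condition.
    """
    tables = {"instr": {}, "cond": {}}
    coded = []
    for kind, text in steps:
        table = tables[kind]
        key = " ".join(text.split())          # ignore spacing differences
        if key not in table:
            table[key] = string.ascii_lowercase[len(table)]
        coded.append((kind, table[key]))
    return coded, tables
```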


3. Input/Output Trace Representation


A Moore-type representation, as defined in [17], can
be used to highlight certain features that must be dealt
with when producing an algorithm from an example trace.
Throughout the rest of the discussion, Moore machines and
algorithms will be used synonymously. Conditions relate to
transitions and instructions relate to states of the
machine. In fact, the function of the synthesizer can be
viewed as that of determining a minimum-state deterministic
Moore machine equivalent of a non-deterministic Moore
machine. Representing input traces as Moore machines will
often show the non-deterministic structure of the example
trace. This non-determinism must be resolved by the
synthesizer in order for an algorithm to be generated.
Figure 5 is the Moore machine representation of the input
trace of Figure 4. Notice that at node 'b', the trace is
non-deterministic. Transition 'y' leads from node 'b' to two
different nodes; similarly, transition 'x' leads from node
'b' to two separate nodes. Figure 6 is the deterministic
Moore machine which has been constructed by our synthesizer
based upon the input trace given in Figure 4. The
non-determinism has been resolved by splitting state 'a'
into two states distinguished from each other by an integer
prefix label. The assignment of the prefix label is the
mechanism used by the synthesizer to prevent
non-determinism. In order to accomplish this assignment, the
synthesizer uses an enumeration technique. Each instruction
is assigned a prefix label in a manner that maintains
determinism and assures that the algorithm will correctly
execute the input trace. It is easy to verify that the
deterministic Moore machine of Figure 6 will execute the
trace.


Figure 5. Non-Deterministic Moore Machine


Figure 6. Deterministic Moore Machine


B. SYNTHESIS PROCEDURE

1. Function

The function of the synthesizer program is to
provide a minimum-state, correct program consistent with the
input trace of the example computation. The synthesis
process will be completed when it is determined which
occurrence of a labelled instruction corresponds to each
particular instruction in the input trace. In order to
accomplish this goal, the synthesizer is basically
structured as a depth-first search algorithm. Backup and
fixup mechanisms exist to enhance the search procedure when
pruning has not kept the algorithm from traversing a
fruitless branch of the search tree. The search mechanism
attempts to assign a label to each instruction in such a
manner that the generated algorithm remains technically
correct; that is, nondeterminism is not allowed to exist and
the original trace can still be executed. A number of
techniques exist within the synthesizer which aid pruning of
the search tree, and thereby make it possible to synthesize
more complicated programs in a reasonable amount of time
than could otherwise be expected from a general enumeration
technique. These techniques offset the major disadvantage of
exponential growth of the search space as a function of
input size which is found in a general enumerative search
technique.

2. Concepts
Certain definitions and concepts must be presented
before the actual algorithm is discussed. In order to
facilitate the discussion, it is necessary to refer to
Figure 7. Each level in the figure consists of an
instruction-condition-instruction triple, referred to as an
I-C-I triple. In Figure 7 the leftmost symbol under I-C-I is
referred to as the leading instruction of the triple, the
middle symbol is the condition, and the rightmost symbol is
the trailing instruction. The trailing instruction at level
i becomes the leading instruction at level i+1. So this
input trace represents the instruction-condition sequence “s


# 


Pe So A fc ee 


1 s2a
2 ans
3 sTa
4 ada
5 axa
6 aya
7 axa
8 anr


Figure 7. Instruction-Condition-Instruction Triple


Two levels i and j are said to belong to the same
couple-class if the elements of the level are the same.
Instruction elements of the trace which are in the same
couple-class may be assigned the same prefix label during
synthesis if the assignment does not cause non-determinism.
For example, given the trace in Figure 7, levels 1 and 3 are
in the same couple-class, as are levels 5 and 7. Difference
set relations are another situation that can exist which is
of interest. The first two elements of level i and level j
are the same, but the third element is not the same. A
difference set relation indicates that the leading
instructions cannot be represented by the same state
regardless of the prefix label assigned during synthesis
because the leading instruction has the same transition to
two different trailing instructions. Again using the above
trace, level 2 and level 8 fall into this category. In this
situation, the index 8 would be entered into the difference
set of level 2. By implication, the index 2 is also in the
difference set for level 8, although, in practice, it is not
entered.

Once the initial couple-class information and
difference set information have been determined, additional
difference set information can be obtained through the
chaining nature of differencing. For example, suppose the
trace consists of the one shown in Figure 8. Then the Moore
machine representation of this trace is shown in Figure 9.





index   trace

  5     axa
  6     axa
  7     ays
  8     axa
  9     axa
 10     ayt


Figure 8. Chaining of Difference Set Relations





Figure 9. Non-deterministic Input Trace 
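
The non-determinism in such a trace can be detected
mechanically. The sketch below assumes that every instruction
symbol is represented by a single state (no prefix labels have
been assigned yet) and that each level is given as a
three-character I-C-I string; the function name
`nondeterministic_transitions` is hypothetical.

```python
def nondeterministic_transitions(triples):
    """Return the (state, condition) pairs that lead to more than one state.

    triples: iterable of three-character strings, read as leading
    instruction, condition, trailing instruction.
    """
    targets = {}
    for lead, cond, trail in triples:
        targets.setdefault((lead, cond), set()).add(trail)
    return {key: sorted(states)
            for key, states in targets.items() if len(states) > 1}


# Levels 5 through 10 of the trace used to illustrate chaining.
chaining_trace = ["axa", "axa", "ays", "axa", "axa", "ayt"]
conflicts = nondeterministic_transitions(chaining_trace)
```

An empty result means the one-state-per-instruction machine is
already deterministic; otherwise each entry names a transition
that must be separated by prefix labels.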


This machine is obviously nondeterministic since
state 'a' transitions by 'y' to two different states.
Difference set resolution requires that the index for 'ayt'
be in the difference set of 'ays'. Since that requirement
causes different states to represent the 'a' in 'ayt' and in
'ays', and further since the trailing 'a' in the preceding
level is exactly the same instruction, the preceding levels
now satisfy the difference set relation. The leading
instruction and the condition are the same, but the trailing
instruction in the I-C-I triple is different since they have
previously been assigned to a difference set relation.
Therefore, the lead instruction must be labelled with a
different prefix during assignment and similarly, the levels
above them. So the Moore machine will now be deterministic
and in the following form.


Figure 10. Deterministic Trace


Given a particular trace derived from the example
execution, there are numerous Moore machines that can be
constructed to satisfy the trace. At one end of the
spectrum, a program can be constructed such that each
succeeding state is assigned a different prefix label. This
method always results in a straight-line program. Each
instruction has one transition entering it and one
transition exiting from it. Allowing this method produces
the maximum size program consistent with the input trace.
See Figure 11. This is not a particularly desirable method
since it does not recognize loop structures that can
significantly reduce the size of the program. Additionally,
it hides the basic structure of the algorithm. The major
advantage, of course, is that absolutely no search is
required to produce a deterministic machine.


condition     instruction


Figure 11a. Trace          Figure 11b. Program


Figure 11. Straight-line Program


On the other end of the spectrum, a program can be
constructed such that each identical instruction receives
the same prefix label. This method takes full advantage of
loop structures, and will result in a minimum state machine.
However, such a method will seldom produce a deterministic
machine; therefore, it will not produce a satisfactory
algorithm. See Figure 12.


level   cond.   instr.

  1       -       a
  2       x       a
  3       x       a
  4       x       a
  5       y       a
  6       y       b

Figure 12a. Trace          Figure 12b. Program


Figure 12. Minimum State Machine

The best solution lies somewhere between these
endpoints. A reasonable first guess at the number of states
required to produce a deterministic machine within this
spectrum can be made by establishing a lower bound on the
number of states. The cardinality of the instruction set is
defined as the number of different instructions appearing in
the trace. Using the above figure as an example, it can be
determined that the cardinality of the instruction set is
two; that is, there are two different instructions, 'a' and
'b', in the trace. This measure provides an absolute lower
bound on the number of states required in the final machine.
This lower bound can be refined by determining a lower bound
on the number of states needed for each individual
instruction. Once again, using the above figure as an
example illustrates this concept. The instruction 'a' at
level 5 must be different than the instructions at levels 1
through 4 because of difference set resolution, or else
nondeterminism results on the transition 'y'. Therefore, in
order to maintain determinism, the instruction 'a' must be
allowed at least two states. Summation of the lower bounds
for each of the instructions gives a lower bound on the
total number of states required for the machine. For this
particular example, the program would be generated as:


Figure 13. Instruction Set Lower Bounds


If the search space is viewed as a tree structure
then the levels of the tree can be associated with the
instructions by assigning the first instruction in the input
trace to the first level, the second instruction to the
second level, and so forth. The branching factor at each
level is the state lower bound computed for the instruction
seen at that level. The prefix label assigned to the
instruction is represented by the specific branch used to
traverse to the next level.

The idea of providing a lower bound on the number of
states leads to an iteratively expanding depth-first search.
When all possible combinations of prefix labels have been
tried, but the algorithm remains non-deterministic, the
lower bound is incremented and the search is restarted from
the top level. When the lower bound is increased, the search
tree obtains additional paths to the final solution by
increasing the branching factor associated with one or more
instructions. The depth of a successful search into the tree
is restricted by the lower bound on the number of nodes
required by the deterministic machine. Only when a pattern
of prefix assignments has been made which allows the
algorithm to remain deterministic and all of the
instructions in the original trace have been assigned prefix
labels will the synthesis terminate. This mechanism prevents
a straight-line model from being output as the algorithm
unless it is the only one that can satisfy the input traces.
More importantly, it provides the minimum-state
deterministic machine capable of executing the input trace.


C. SYNTHESIZER STRUCTURE

The synthesis program is subdivided into two primary
modules: static processing of the input trace; and dynamic
processing of the information extracted from the input trace
during the preprocessing, or static processing phase. Static
processing provides information such as couple-classes,
difference sets, and lower bounds on the number of machine
states. Dynamic processing uses knowledge inherited from
preprocessing to guide the search mechanism to a final
output of the algorithm. These two modules will be discussed
in turn, and the primary mechanisms involved will be
amplified.


1. Static Processing

Static processing can be conceptualized as
consisting of three main functions: (a) accept the input
trace; (b) preprocess the trace for difference sets,
couple-classes, and state bounds; and (c) prepare a trace
table for further use by dynamic processing. Once this
preprocessing has been accomplished, the static module is no
longer necessary to the synthesizer.

In the current configuration, the static module
expects to find the input as a sequence of
instruction-condition-instruction triples. Figure 14 is an
example of an input trace.

level   trace

  1     anp
  2     psa
  3     aga
  4     ayr
  5     rsr
  6     rsr
  7     rra
  8     aga
  9     ayt


Figure 14. Typical Input to Static Processor

Each line consists of a triple, for example 'anp'.
The 'a' represents an instruction, the 'n' represents the
condition which causes the program trace to transition to
the next instruction 'p'. For each level, the first element
represents the same instruction as the last element of the
preceding level. This is easier to see if the above trace is
represented as a Moore machine in which the nodes are
instructions and the conditions are transitions. State 'a'
transitions on condition 'n' to state 'p' which transitions
on condition 's' to state 'a' which transitions on condition
'g' back to state 'a', etc.


Figure 15. Moore Machine for Input Trace


level   trace   couple-class   difference set

  1     anp          -               -
  2     psa          -               -
  3     aga          1              {8}
  4     ayr          -              {9}
  5     rsr          2               -
  6     rsr          2               -
  7     rra          -               -
  8     aga          1               -
  9     ayt          -               -


Figure 16. Intermediate Trace Table

Each occurrence of an instruction symbol in the input trace
is represented by the same state at this point in the
synthesis.

Once the input trace has been accepted, static
processing can begin. Static processing consists of
determining the level indices associated with each
couple-class and with each difference set. For the trace of
Figure 15, these are shown in Figure 16.

There are two couple-classes in this trace. They are
[aga] at levels 3 and 8, and [rsr] at levels 5 and 6. The
remaining levels are not assigned to a couple-class because
no other levels match with them. Couple-class information is
useful to the dynamic processor for determining forced
assignments and dynamic non-equivalence. These ideas will be
discussed more fully in the section on dynamic processing.

Difference sets exist for levels 3 and 4. Level 4
has a difference set which contains the index 9; that is,
the element at level 4, 'ayr', must have a different prefix
label on 'a' than the element at level 9, 'ayt'. If the 'a'
is not labelled differently during the synthesis,
nondeterminism will result since the same transition would
lead to different nodes.

Difference set resolution is a very powerful
mechanism for ensuring deterministic behavior of the
algorithm. A considerable amount of the prefix label
assignments to the nodes can be resolved using difference
sets. Notice that level 8 appears in the difference set for
level 3 even though levels 3 and 8 are in the same
couple-class. At first this appears contradictory since
equivalent couple-class names imply that the elements are
the same, but difference set existence forces the lead
instructions to be different. This points out the relative
power of couple-class information and difference set
information. Difference set information is inviolable.
Couple-class information only hints at equivalence. In this
particular example, the entry at level 3 was caused by the
chaining effect of difference set resolution. Notice that
since the 'a' at level 4 must be different than the 'a' at
level 9, and notice that since the trailing 'a' at level 3
is, by definition, the same as the leading 'a' at level 4,
the trailing 'a' at level 3 cannot be the same as the
trailing 'a' at level 8; therefore, levels 3 and 8 cannot be
in the same couple-class.
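
The couple-class and difference set computations, including
the chaining effect, can be sketched as follows. This is an
illustration under stated assumptions rather than the code of
the static module: the function name `static_process` is
hypothetical, levels are numbered from 1, and each level is a
three-character I-C-I string.

```python
from collections import defaultdict


def static_process(triples):
    """Return (couple_classes, difference_sets) for a list of I-C-I triples."""
    n = len(triples)

    # Couple-classes: levels whose three elements are all the same.
    groups = defaultdict(list)
    for level, triple in enumerate(triples, start=1):
        groups[triple].append(level)
    couple_classes = [levels for levels in groups.values() if len(levels) > 1]

    # Direct difference relations: same lead and condition, different trail.
    diff = defaultdict(set)
    pending = []
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            a, b = triples[i - 1], triples[j - 1]
            if a[:2] == b[:2] and a[2] != b[2]:
                diff[i].add(j)
                pending.append((i, j))

    # Chaining: if the leads of levels i and j must differ, then the trailing
    # instructions of levels i-1 and j-1 differ, so those levels are related
    # too whenever their lead and condition agree.
    while pending:
        i, j = pending.pop()
        pi, pj = i - 1, j - 1
        if (pi >= 1 and triples[pi - 1][:2] == triples[pj - 1][:2]
                and pj not in diff[pi]):
            diff[pi].add(pj)
            pending.append((pi, pj))

    return couple_classes, {k: sorted(v) for k, v in diff.items() if v}
```

Applied to the trace of Figure 14, the sketch gives the
couple-classes and difference sets listed in Figure 16.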

To compute the lower bound on the number of states
in the algorithm, the minimum number of states needed for
each instruction is summed. For this same example, the
instruction set consists of {a,p,r,t}. The bounds for p, r,
and t are each 1. The bound for 'a' is 2. There must be at
least two different occurrences of 'a' from the difference
set resolution. Therefore, the minimum number of states with
which a deterministic Moore machine can be constructed for
this trace is 5.

Finally, static processing passes all the
information concerning the input trace to the dynamic
processor via a trace table in the following form. Each
level has only one associated condition and one associated
instruction. Since difference set information is associated
with the lead instruction in an
instruction-condition-instruction sequence, it is entered at
that level. Since couple-class information is associated
with the entire instruction-condition-instruction sequence,
it is associated with the trailing condition-instruction
pair.

level   condition   instruction   couple-class   difference set

  1         -            a             -               -
  2         n            p             -               -
  3         s            a             -              {8}
  4         g            a             1              {9}
  5         y            r             -               -
  6         s            r             2               -
  7         s            r             2               -
  8         r            a             -               -
  9         g            a             1               -
 10         y            t             -               -


Figure 17. Trace Table
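
The layout of the trace table and the summed lower bound can
be sketched as below. The names `TraceRow`, `build_trace_table`
and `state_lower_bound` are hypothetical, and the bound used
here is a simplifying assumption: an instruction receives a
bound of 2 as soon as any of its occurrences has a non-empty
difference set, and a bound of 1 otherwise.

```python
from typing import FrozenSet, NamedTuple, Optional


class TraceRow(NamedTuple):
    condition: Optional[str]        # condition entering this instruction
    instruction: str
    couple_class: Optional[int]     # class of the triple ending at this row
    difference_set: FrozenSet[int]  # levels whose lead must differ from this one


def build_trace_table(triples, class_of, diff_sets):
    """class_of and diff_sets are keyed by triple level, numbered from 1."""
    rows = [TraceRow(None, triples[0][0], None,
                     frozenset(diff_sets.get(1, ())))]
    for level, (lead, cond, trail) in enumerate(triples, start=1):
        # Row level+1 holds the trailing pair of triple `level`; that
        # instruction is also the lead of triple level+1.
        rows.append(TraceRow(cond, trail, class_of.get(level),
                             frozenset(diff_sets.get(level + 1, ()))))
    return rows


def state_lower_bound(rows):
    """Sum a per-instruction bound of 1, or 2 when a difference set exists."""
    needs_two = {row.instruction for row in rows if row.difference_set}
    return sum(2 if sym in needs_two else 1
               for sym in {row.instruction for row in rows})
```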


2. Dynamic Processing 


Dynamic processing involves assigning prefix labels
to the states of the machine. In this way, separate
occurrences of the same instruction are differentiated. The
dynamic processor is the search mechanism for the
synthesizer. It operates in such a way that, at any point in
the synthesis, the portion of the trace previously processed
represents a deterministic Moore machine. In order to
maintain the determinism, dynamic processing steps through
three phases: (1) assignment of the prefix label to the
instruction; (2) difference set resolution; and (3) dynamic
equivalence assurance. Additionally, each of these phases
may result in fixup and backup conditions associated with
them. The fixup/backup conditions encountered during
difference set resolution or during dynamic equivalence
checking are indicators that, if the current assignments
remain the same, a nondeterminism will occur in future
assignments. As such, they inform the pruning mechanisms of
the search algorithm.

Mean teeral part of the dynamic processor is tne 
failure memory. It controls the searcn. Tne failure memory 
Pepeeeem conceptualized as a L x M matrix where Lis tne row 
peze and corresponds to the number or levels in the trace. 
Bacn row nas M columns wnere Mis equal to tne lower pound 
feoeeneod tO the instruction contained on that level of the 
metcen An Entry into the failure memory at some level i and 
some column j, where 1 <= i <= L and 1 <= j <= M, prevents 
mee assignment of j as a prefix label for tne instruction at 
level i. When a failure memory cell contains an entry it i15 
called a valid cell; otnerwise it is invalid. Sach cell of 
mae ftfaliure memory is @ twor-element entry. Tne structure 


factor is the first element. It indicates wnichn level of the 
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maaece Caused tne entry. The free state factor is tne second 
peeve nt., AS tn® name indicates, this element is a function 
meeeeune number ot free states avdilaple at tne time of 
assignment. Tne specifics of tne failure memory operation 
poo che nature of failure memory entries will be discussed 
througnout tne rest of the section as Sach phase of the 
G@ynamic processor is discussed. 
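
As an illustration only, the failure memory described above might be
sketched in Python as follows. The names Entry, FailureMemory, is_valid
and enter are hypothetical and do not come from the synthesizer itself;
the sketch assumes one row per trace level, with as many columns as the
lower bound of the instruction on that level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    structure_factor: int   # level whose assignment caused the entry
    free_state_factor: int  # derived from the free states available then


class FailureMemory:
    """Hypothetical sketch: rows are trace levels, columns are prefix labels."""

    def __init__(self, bounds):
        # bounds[i - 1] is the lower bound M of the instruction at level i
        self.rows = [[None] * m for m in bounds]

    def is_valid(self, level, label):
        # a valid cell contains an entry and blocks `label` at `level`
        return self.rows[level - 1][label - 1] is not None

    def enter(self, level, label, entry):
        # a cell already made valid by an earlier assignment is left alone
        if not self.is_valid(level, label):
            self.rows[level - 1][label - 1] = entry


fm = FailureMemory([2, 1, 2])   # three levels with bounds 2, 1 and 2
fm.enter(3, 1, Entry(structure_factor=1, free_state_factor=1))
```

After the call above, label 1 is blocked at level 3 while label 2
remains available there.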
a. Label Assignment 

As previously mentioned, label assignment is the
first function provided by the dynamic processor. A label
assignment can be either forced or arbitrary. Additionally,
the assignment can result in the creation of a new state, a
label-name combination not seen before. A forced assignment
occurs when the instruction at the current working level is
a member of the same couple-class as an instruction at a
previous level, and the lead instruction into both of those
levels has the same label assignment. The current working
level is defined as the level of the trace which contains
the most recently assigned prefix label, but for which
difference set resolution and dynamic equivalence checking
have not been completed. An example is given in the trace
shown in Figure 18.

level   condition   instruction   c-c   label
  4         a            a         -      2
  5         n            r         3      1
  6         r            a         4      2
  7         n            r         3     *1
  8         r            a         4     *2

* indicates forced move

Figure 18. Partial Trace Labelling

The label at level 7 is forced by the label
assignments at levels 4 and 5. Notice that the instructions
at level 5 and at level 7 are in the same couple-class,
and that the instructions at levels 4 and 6 have the same
prefix label. This condition forces the instruction at level
7 to have the same prefix label as the instruction at level
5. The Moore machine representation of the partial trace is
shown in Figure 19. The assignment at level 8 is also forced
for similar reasons. By definition, any forced assignment
involves previously assigned states, label-instruction
combinations, that have been seen before; therefore, no
forced assignment can result in a new state.





Figure 19. Partially Determined Moore Machine 
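
The forced-assignment rule can be sketched as follows. This is a
minimal illustration, not the synthesizer's code: the function name
forced_label and the dictionary layout are assumptions, and the
couple-class and label values are those read from the partial trace of
Figure 18.

```python
def forced_label(couple_class, labels, level):
    """Return the label forced at `level`, or None when the choice is free.

    couple_class maps level -> couple-class number (None when absent);
    labels maps level -> prefix label already assigned.
    """
    cc = couple_class.get(level)
    lead = labels.get(level - 1)          # label of the lead instruction
    if cc is None or lead is None:
        return None
    for earlier in sorted(labels):
        if earlier >= level:
            break
        # same couple-class and same lead label: the states must coincide
        if couple_class.get(earlier) == cc and labels.get(earlier - 1) == lead:
            return labels[earlier]
    return None


couple_class = {4: None, 5: 3, 6: 4, 7: 3, 8: 4}
labels = {4: 2, 5: 1, 6: 2}
label_7 = forced_label(couple_class, labels, 7)
labels[7] = label_7
label_8 = forced_label(couple_class, labels, 8)
```

With these values the labels at levels 7 and 8 repeat those at levels 5
and 6, which matches the forced moves marked in the figure.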
The failure memory can be used in conjunction
with forced assignments to signal a backup condition to the
system. If the failure memory entry corresponding to the
label assignment at the current working level is valid, then
a contradiction results from the forced assignment. Suppose
that the trace table and failure memory are as shown in
Figure 20, and the forced assignment at level 8 has just
been made. The entry '1.1' at row 8, column 2 of the failure
memory is interpreted in the following manner. The integer
to the left of the decimal indicates that the entry was
caused by the current assignment at level 1. The '1' to the
right of the decimal point is the number of free states + 1
available when the assignment at level 1 caused the failure
memory entry; therefore, when the entry was made there were
no free states available. A free state is one which has not
been bound to a particular instruction.

The assignment at level 8 is forced. In other
words the sequence of the previous assignments causes the
prefix label of the instruction at level 8 to be a '2'.
However, the failure memory contains an entry at row 8
column 2, FM(8,2). This entry indicates that the instruction
at level 8 cannot be assigned the label '2', for if it were
to be assigned a '2', a nondeterminism would result. To
resolve the conflict, backup is initiated until the last
unforced assignment is found. In this case, the backup is to
level 6. The assignment at level 6 will be changed and the
search will continue from there.


Trace Table                       Failure Memory
level cond instr c-c label        1   2   3

4 a a = a = = = 

So n E 3 1 a = = 

6 r a 4 2 = = i 

iC n i 3 eat 5 ms = 

= r a 4 ae = Dass. or 




Figure 20. Trace Table/Failure Memory Configuration 
for a Forced Assignment 


If the assignment is not forced, the failure
memory row corresponding to the current working level is
searched for the first occurrence of an invalid cell. An
invalid cell is one which does not contain a failure memory
entry. If a cell is invalid, the assignment of the prefix
label corresponding to the failure memory column index for
that cell is possible on that level of the trace. The column
number of the first invalid cell becomes the label
assignment for the instruction at that level. For example,
suppose level 6 is the current working level and the trace
table and failure memory have the configuration shown in
Figure 21.
Trace Table                       Failure Memory


mevel cond instr ne ia 3 & 
5 iG a de 4.1 a = 


Figure 21. Trace Table Entry Showing
Arbitrary Assignment Method


The first invalid cell in the failure memory on
row 6 is in column 3; therefore, instruction 'a' for level 6
will be assigned a prefix label of 3. These non-forced
assignments may result in the creation of a new state; that
is, a label-instruction pair not previously assigned during
the synthesis. If, at some future point in the search, a
backup is initiated that reaches this level of the trace,
the backup mechanism will not stop to perform a retry. At
any point in the synthesis, all previous levels have
received assignments based on the constraint that the
minimum number of states has been used consistent with
maintaining determinism; therefore, assigning a different
prefix label to a state which has been defined as a new
state only changes the name of the state, and does not
change the structure of the algorithm. Since the structure
of the algorithm has not been changed, the cause of the
nondeterminism is still present.
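
The arbitrary (non-forced) assignment rule amounts to a scan for the
first invalid cell. A minimal sketch follows; the function name
arbitrary_label is hypothetical, None stands for an invalid cell, and
the entry values in the example row are placeholders rather than
values taken from Figure 21.

```python
def arbitrary_label(row):
    """Return the column number of the first invalid (empty) cell.

    row[j - 1] is None for an invalid cell and any other value for a
    valid cell. Returns None when every cell in the row is valid.
    """
    for column, cell in enumerate(row, start=1):
        if cell is None:
            return column
    return None


# Columns 1 and 2 are blocked, so the first free label is 3.
row_6 = ["x.x", "4.1", None]    # placeholder entries, for illustration
chosen = arbitrary_label(row_6)
```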
One other type of assignment should be mentioned
at this point. Pseudo-assignment occurs when there is only
one invalid cell left in a failure memory row at a level
other than the current working level and there are no free
states available. Although pseudo-assignment does not
immediately cause a label to be assigned to the instruction
at that level, it does simulate a look-ahead mechanism for
the search technique by triggering difference set resolution
and dynamic equivalence checking as if that level of the
trace were assigned a value. Since the pseudo value is the
only value currently possible for that level, if a backup or
fixup condition is encountered during pseudo-assignment, the
assignment mechanism can immediately try another label at
the current working level, thereby saving the unnecessary
search of a path which it already knows to be nonproductive.

Once a tentative label assignment has been made
to the instruction at the current working level, difference
set resolution and dynamic equivalence checking can be
performed. Although these actions may cause a fixup on the
prefix label at the current working level, their primary
purpose is to furnish information to the failure memory that
will help guide future label assignments.

b. Difference Set Resolution

Difference set resolution prevents future
assignments being made that are known to cause
nondeterminism if the current assignments remain unchanged.
Difference sets outline a significant portion of the
structure of the input trace without regard to label
assignments in that they prevent nondeterminism from
occurring as a result of the same transition out of a state
leading to more than one following state. Consider Figure
22.





Figure 22. Nondeterministic Input Trace 


There are several instances where difference set
resolution will force a state to be split into two or more
different states. States 'a', 'g', 'p', and 't' all have
nondeterministic transitions associated with them. The trace
table and failure memory configuration for this trace is
shown in Figure 23.
Trace Table                                   Failure Memory

level cond instr c-c difference set label     1   2   3
1 - a - {3,5,15,18} 1 - - -
2 n p 1 {4,11} 1

3 S a 2 TS) ors 2 ek 

4 n D 1 {11} 2 eel 

> S a 2 - 2 ees Oued 

6 n ic 3 = i 

ic S a = = 

3 y g = Vom ecs 

Q r g 4 {19,2¢} 

12 r 2 4 {20} 

11 iF D 5 {21} 2.14.1 
ie Ss t im Veo Pee dc | 

13 S t 5 a2} 

14 S t 6 = 

ES S a a — leo ack 
16 n r S a 

i? n ic = = 

18 S a ? - Take Oak 
19 n r S = 

20 p g = = 

el i D o = 4.1 
Ze S a Z = 


Figure 23. Trace Table/Failure Memory Configuration
After Assignment at the Fourth Level


As dynamic processing proceeds with label
assignments, difference set resolution occurs. Difference
sets are resolved by making an entry into the failure memory
in the row for the level corresponding to the difference set
element, and in the column corresponding to the prefix label
assigned to the instruction at the level from which the
difference set is being resolved, provided the cell has not
already been made valid through a previous assignment. For
example, if the prefix assignment at level 1 is a '1', the
failure memory entries are made in column 1 at levels 3, 5,
15, and 18. Similarly, when the assignment '1' is made at
level 2, failure entries are made at levels 4 and 11. Now
when the assignment at level 3 is made, the dynamic
processor will not try to assign a prefix value of '1' since
the failure memory cell at (3,1) is valid. The assignment
will automatically be '2'. Notice that at level 5 the
previous assignments have caused the prefix label to be a
'3'. In other words, the failure memory has caused the
search tree to be pruned so that an assignment of '1' or '2'
will not be tried. Either one of these assignments would
have resulted in nondeterminism being introduced into the
trace at level 6.
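
The resolution step can be sketched in a few lines. This is an
illustration under stated assumptions: the failure memory is modelled
as a dictionary keyed by (level, label), the function names are
hypothetical, and only the difference sets quoted in the text for
levels 1 and 2 are used. The three-column bound for instruction 'a' is
taken from the failure memory columns of Figure 23.

```python
def resolve_difference_set(fm, diff_sets, level, label, free_states=0):
    """Block `label` at every level in the difference set of `level`."""
    for target in diff_sets.get(level, ()):
        # setdefault leaves a cell alone if it is already valid
        fm.setdefault((target, label), (level, free_states + 1))


def first_free_label(fm, level, bound):
    """Smallest label in 1..bound whose cell at `level` is still invalid."""
    for label in range(1, bound + 1):
        if (level, label) not in fm:
            return label
    return None


diff_sets = {1: {3, 5, 15, 18}, 2: {4, 11}}
fm = {}
resolve_difference_set(fm, diff_sets, level=1, label=1)
resolve_difference_set(fm, diff_sets, level=2, label=1)
blocked_rows = sorted(row for (row, col) in fm if col == 1)
label_at_3 = first_free_label(fm, level=3, bound=3)
```

Because cell (3,1) is valid after the first call, the search at level 3
skips label 1 without trying it, which is the pruning effect described
above.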


Figure 24a. Prefix Label Equals 1
Figure 24b. Prefix Label Equals 2
Figure 24. Nondeterministic Prefix Label Assignments


While failure memory entries are being made
under difference set resolution, it is possible for a row to
have all cells valid except one. This has been previously
defined as a situation leading to pseudo-assignment. This
situation has occurred at level 11 in the example given in
Figure 23. When such an occurrence happens a look-ahead
mechanism is triggered to resolve the difference set at that
level. In this example, the failure memory cell at (21,3)
has been validated with an entry which indicates the current
working level as level 4 when the pseudo-assignment occurred
at level 11. Another situation which can occur in a failure
memory row is when all the entries in the row become valid.
This condition is called an incipient fence. When an
incipient fence exists and there are no free states
available, then no assignment can be made at that level.
This condition is called a fence.
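
The three row conditions named above can be told apart with a simple
count of invalid cells. The sketch below is illustrative only; the
function name classify_row and the returned strings are hypothetical,
and None again stands for an invalid cell. The caller is assumed to
apply the pseudo-assignment case only to levels other than the current
working level.

```python
def classify_row(row, free_states):
    """Classify one failure memory row.

    row holds None for invalid cells; free_states is the number of
    states not yet bound to an instruction.
    """
    invalid = sum(1 for cell in row if cell is None)
    if invalid == 0:
        # every label is blocked; only a free state could rescue the level
        return "fence" if free_states == 0 else "incipient fence"
    if invalid == 1 and free_states == 0:
        # exactly one label remains possible: look ahead with it
        return "pseudo-assignment"
    return "open"


status_11 = classify_row(["2.1", "4.1", None], free_states=0)
```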

Since the search mechanism always knows the
level from which it is doing look-ahead by difference set
resolution, it is able to perform a fixup on the label
assignment at the earliest possible time. A fixup is
accomplished by incrementing the prefix label by one. If an
entire row in the failure memory becomes valid and there are
no free states available, a fixup must be performed on the
label assignment at the current working level. If the label
is left the same, then when the search reaches the fenced
level, no assignment will be possible. Each time a fixup
occurs, all entries made in the failure memory as a result
of the previous label assignment are deleted, and entries
are then made based on the new label.

c. Dynamic Equivalence

Couple-class information furnished by static
processing aids in the determination of dynamic
nonequivalence. Dynamic nonequivalence can occur during the
synthesis at any level below the current working level when
the couple-classes are equal. Dynamic equivalence results
when instructions in the same couple-class have been
assigned the same prefix label. Consider Figure 25. The
trace triples at levels 5 and 6 and at levels 11 and 12 are
[aga]; therefore, they are in the same couple-class. The
instruction 'a' at level 5 has been assigned a prefix of
'2', and the instruction 'a' at level 6 has been assigned a
prefix of '1'. Now, if the instruction at level 11 is
assigned a prefix of '2' and the instruction at level 12 is
assigned a prefix of '1', dynamic equivalence will occur.
Further, the assignment at level 12 will be forced. Dynamic
nonequivalence results when such an assignment scheme
causes nondeterminism. Dynamic equivalence checking
functions as a look-ahead mechanism by preventing the future
occurrence of a forced assignment which will result in
nondeterminism. Suppose the synthesizer is inspecting the
trace in Figure 25, and has just assigned the instruction at
level 6 a prefix of '1'.

Notice that level 12 is in the same couple-class
as level 6. Since the instruction at each of these levels is
in the same couple-class, the possibility exists that they
may be the same instruction. If the instruction at level 11
is assigned a label of '2' when the working level reaches
that part of the trace, then the assignment at level 12 will
be a forced assignment of '1'. However, an entry has already
been made in the failure memory at (12,1) which indicates
that the instruction at level 12 cannot be assigned a prefix
label of 1.
Trace Table                       Failure Memory

level cond instr c-c label        1   2   3
5 a a 1 2 a SS = 
6 S a 2 il = = = 
a g a ies = = = a 
eb f. a = = Gn 

12 2 a Z = 4.1 - — 
eS n a 3 = = = = 


Figure 25. Trace Table/Failure Memory

In order to avoid this contradiction and a
backup, dynamic nonequivalence processing causes an entry at
(11,2) of the failure memory which corresponds to the
labelling of '2' given to the instruction at level 5. Once
this is accomplished, when the working level descends to
level 11, an assignment of '2' cannot be made and, as a
result, the assignment at level 12 will no longer be forced
by dynamic equivalence. This gives the synthesizer a chance
to try other assignments that will maintain determinism of
the algorithm.

Pseudo-assignment conditions and fixup
conditions can occur in the failure memory as a result of
validation of all but one of the failure memory cells in a
row in the same manner that they occur in difference set
resolution. Additionally, dynamic equivalency and difference
set resolution can interact to cause failure memory entries
in the following manner. If a failure memory entry is made
by difference set resolution at any level which is in the
same couple-class as a level previously assigned a prefix
label, and if the failure memory entry prevents the
assignment that will cause the instructions to become part
of the same state, then dynamic nonequivalence will result;
therefore, an entry must be made in the failure memory to
indicate this condition.
3. Backup/Fixup

The discussion of backup and fixup conditions has
been saved until last. The basic idea behind constructing
the synthesizer is to provide as much information as
possible to the search mechanism, and thereby direct the
label assignment with a minimal number of retries. With this
in mind backup and fixup become last resorts.

The fixup operation attempts to resolve
nondeterminism by incrementing the label at the current
working level when a contradiction occurs. If the newly
incremented label is not a legal assignment or does not
correct the contradiction, then backup must be initiated.
The fixup operation cannot be attempted if the assignment at
the current working level is forced or if the assignment
created a new state. In either of these cases, a fixup
operation would leave nondeterminism in the algorithm.

If a fixup fails, or cannot be attempted, backup is
initiated. Backup must be initiated from the current working
level when any level is discovered which contains one of
these conditions:

1) The label assignment is forced and the failure memory
cell corresponding to that level and label is valid,
2) The label assignment causes a contradiction and
represents a new state, or
3) There is no free state available for the instruction
at a particular level, and all entries in the failure
memory row at that level are valid.

The backup begins at the current working level regardless of
which level triggered the mechanism, and continues until
none of the three conditions given above are present. At
that level a fixup operation is attempted and the search
begins anew. Any entries into the failure memory which were
caused by levels greater than or equal to the new current
working level are invalidated by resetting the failure
memory entries to (0,0). Additionally, any assignments are
removed along with their side-effects, such as annotations
on forced assignments and new states. If backup causes the
working level to be decremented to zero, a free state is
added for the use of the first instruction needing more
states than initially allotted as the lower bound.
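
The bookkeeping performed when backup settles on a new working level
can be sketched as below. This is a simplified illustration, not the
synthesizer's routine: the function name back_up is hypothetical, the
failure memory is modelled as a dictionary from (level, label) to
(structure factor, free state factor), deleting a key stands in for
resetting a cell to (0,0), and the entry and label values are invented
for the example. Returning the old label so that a fixup can increment
it is likewise an assumption.

```python
def back_up(fm, labels, new_level):
    """Undo the effects of every level at or beyond `new_level`.

    Returns the label previously held at `new_level` (or None) so the
    caller can attempt a fixup by incrementing it.
    """
    # invalidate entries whose structure factor names an undone level
    for cell, (cause, _free) in list(fm.items()):
        if cause >= new_level:
            del fm[cell]
    old_label = labels.get(new_level)
    # remove the assignments themselves
    for level in [lv for lv in labels if lv >= new_level]:
        del labels[level]
    return old_label


fm = {(3, 1): (1, 1), (8, 2): (6, 1), (11, 2): (7, 1)}
labels = {4: 2, 5: 1, 6: 2, 7: 1, 8: 2}
old = back_up(fm, labels, new_level=6)
```

Only the entry caused by level 1 survives, and the assignments at
levels 6 through 8 are cleared before the search resumes.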
A. PROBLEM SPECIFICATION

The program synthesizer expects a set of triples where
each triple is an instruction, a condition, and an
instruction. Biermann [2] has shown that conditions
inadvertently or purposely omitted by the user may be
inserted into a trace. The algorithm for insertion of
conditions collects the set of atoms seen on the transitions
from an instruction. An atom is an entity which has a value
of either 'true' or 'false'. A condition is composed by
repeated conjunction and disjunction operations on atoms. For
example, an atom may be 'c <= 6', but a condition may be 'c
<= 6 and a = 4'. A set of minterms is computed from the set
of atoms and one of the minterms is inserted after each
occurrence of that instruction in the trace. If {a,b} is a
set of atoms, then the set of minterms will be
{{a,b},{~a,b},{a,~b},{~a,~b}} where ~ stands for logical
negation. It has been shown in reference [16] that only one
of the minterms can be true for each occurrence of a
transition from any single instruction.
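
The minterm construction can be illustrated with a short sketch. The
function names minterms and holds are hypothetical, atoms are
represented as strings, and a leading '~' marks a negated atom; none
of this is taken from the insertion algorithm itself.

```python
from itertools import product


def minterms(atoms):
    """All sign combinations over the atoms, each as a frozenset of literals."""
    ordered = sorted(atoms)
    return [
        frozenset(a if keep else "~" + a for a, keep in zip(ordered, signs))
        for signs in product([True, False], repeat=len(ordered))
    ]


def holds(minterm, truth):
    """True when every literal in the minterm agrees with the truth values."""
    return all(truth[lit.lstrip("~")] != lit.startswith("~") for lit in minterm)


terms = minterms({"a", "b"})
true_count = sum(holds(m, {"a": True, "b": False}) for m in terms)
```

For two atoms there are four minterms, and for any one truth
assignment exactly one of them holds, which is the property cited
above.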

The problem with the algorithm is that it is incapable
of inserting conditions if the user has failed to supply any
atoms after a particular instruction. For example, if the
user should specify instruction I1 followed by instruction
I2 in one part of the trace and instruction I1 followed by
I3 in another part of the trace, but the user fails to
place a condition after either occurrence of I1, then the
algorithm will be unable to generate a condition for I1. It
is assumed that I1 does not appear with an atom elsewhere in
the trace. The synthesizer will force two states for I1 to
resolve any nondeterminism. This mechanism is fully
described in Section II. If conditions had been supplied in
the above example, the difference in the two programs would
be the number of states assigned to instruction I1. Figure
26 shows a partial computation without explicitly expressed
conditions along with the associated synthesized program
fragment. Figure 26 assumes that I1 does not appear
elsewhere in the trace. Figure 27 is a representation of the
same partial computation except that the conditions c1 and
c2 have been explicitly expressed. The computations in both
figures are the same, and each program fragment will
correctly execute either trace; therefore, the programs must
be equivalent programs with respect to program behavior.
However, the program in Figure 27 is minimal in that it
contains fewer states because the user explicitly supplied
the conditions.

Example Computation          Synthesized Program

Figure 26. Computation without Explicit Conditions

Example Computation          Synthesized Program

Figure 27. Computation with Explicit Conditions


We intend to show that there are mechanisms which can be
used to automatically generate the necessary conditions for
the correct synthesis of an algorithm produced by an example
computation without the user explicitly defining them. The
problem may be described as follows. Given an example
computation without explicitly defined conditions, infer
those conditions necessary to control the flow of
computation in a manner such that the synthesized program
will demonstrate the behavior desired by the user. In order
to facilitate the solution to the problem, a condition will
be viewed as a function that returns a value of 'true' or
'false' when called rather than a logical operation on
atomic boolean entities. The problem can then be thought of
as constructing a function.

Very little information is available to the current
version of the synthesizer when the user provides only a
sequence of instructions. It is certainly not enough to
generate minimal programs as described in Figure 27. This
led us to search for other sources of information that would
allow us to construct the necessary conditions. We soon
realized that the instructions issued by the user do not
exist in a vacuum. These instructions manipulate data. If
the entire computer memory, including registers, is viewed
as the domain of interest, then execution of an instruction
always changes this state. Intuitively, the domain also
reflects the reason that the user decided to execute a
particular instruction. A search of a space of this size in
order to determine the reason is impractical; however,
observing only those data elements affected by the sequence
of instructions can often be quite practical and can
significantly reduce the search space.

We chose the text editing domain as the domain of
interest since we felt that it would be sufficiently
interesting to warrant application of synthesis techniques.
This domain was selected because, first, techniques
developed in this domain may be general enough for extension
to other domains; secondly, the world for this domain can
be described as the set of all characters contained in a
particular text file, which makes the world finite; and
finally, the instruction set is small enough to be
manageable.

Although our primary research is directed toward
developing techniques to apply to automatic condition
generation, we feel that the synthesizer could be a powerful
text editor and could provide some useful features not
normally seen in conventional text editors. Extended
features could include the ability to capitalize the first
letter of every sentence, the ability to capitalize all
small letters in the text, the ability to identify an item
and perform some operation before, after or on it, or any
combination of these editing actions.

The working hypothesis is to have the user process the
text file in a normal manner and have the synthesizer infer
the program from his actions. Two requirements were levied
on the user. The first requirement on the user is that he
must inform the synthesizer when he desires to have a
program generated so that the synthesizer can begin
monitoring the user's actions. A great deal of time was
spent trying to figure out methods that allowed one general
mechanism to be used to monitor the user's actions and the
resulting changes in the text file. Since we could not
produce such a mechanism, a second requirement was levied on
the user. This requirement recognizes the basic distinction
between two different aspects of text editing: context free
substitutions, and context sensitive substitutions. We
define a context free environment to be one in which the
character to be operated upon is not dependent on characters
around it. Capitalizing all occurrences of small letters is
an example of a context free operation. A context sensitive
operation is defined as an operation in which the action to
be performed on a character or sequence of characters
depends upon other characters around the main character of
interest. Capitalizing the first letter of every sentence is
a context sensitive operation. Condition inference in a
context sensitive environment is inherently more difficult
than in a context free environment in that the condition
must be constructed from events which require a look-ahead
capability not inherent in the synthesizer. The user will be
free to switch from environment to environment at his
convenience. The synthesizer will create program segments
from each environment which can be used to construct a
complete program by a post-processor.


B. DESIGN FOR A CONTEXT FREE ENVIRONMENT
1. Overview 
Programs that operate on a single entity can be
constructed by the synthesizer. Figure 28 shows the
construction of a program from a trace intended to
communicate that the letter d should be capitalized
wherever it appears in the text file. The column labelled
'trace' contains triples of the form instruction, condition,
instruction. B is the start instruction, R is the move right
instruction, C is the capitalize or change instruction and S
is the stop instruction, respectively. The conditions for
the trace are the characters seen in the text file prior to
the execution of the second instruction in each triple. The
special condition '#' is the null condition, and is always
inserted after the start instruction.

The generated program will correctly execute the
trace that was used to construct it, and by examination of
the program it can be shown that the program will convert
all d's to D's in a text file consisting of the characters
pee GO, 2, F anit G. There are no arcs available for otner 
characters in the character set. In order to generate a
program to perform the same function on an arbitrary text
file the user would be forced to give an example of the
desired transition for every character in the character set.

Since it is desirable to relieve the user of the
chore of providing an inordinate number of examples in order
to completely specify the function, a method is required
that utilizes a few examples of the types of conditions that
appear on the arcs to generalize the conditions into
a more compact and complete form. If a generalization can be
found, the multiple arcs may be replaced with a more general
condition and, therefore, correct programs can be created
from fewer examples. However, the combination of arcs between
nodes must be accomplished so that determinism is maintained
or the synthesizer will not create a minimum state machine
capable of performing the desired function. That means that
the generalization technique must be able to handle
conflicts properly. The arcs in Figure 28 that originate at
state R and terminate at state R appear to consist of
elements from the capital letters and small letters. The
generalization of {x | x ∈ capital letters} U {z | z ∈ small
letters} would appear to be a reasonable replacement for all
of the R to R arcs. If this generalization was made a
conflict would result because the letter 'd' is also an
element of {z | z ∈ small letters}.
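
The conflict described above can be detected mechanically before a
generalization is accepted. The following is a hedged sketch rather
than the preprocessor's heuristic: the function name generalize, the
CLASSES table, and the characters placed on the example arcs are all
assumptions made for illustration.

```python
import string

CLASSES = {
    "capital letters": set(string.ascii_uppercase),
    "small letters": set(string.ascii_lowercase),
}


def generalize(arcs, source, target):
    """Propose class names for one arc and report any conflicting characters.

    arcs maps (source, target) -> set of characters seen on that arc.
    Returns (names, conflicts); conflicts lists characters the proposed
    classes would capture although they leave `source` for another state.
    """
    seen = arcs[(source, target)]
    names = [name for name, members in CLASSES.items() if seen & members]
    covered = set().union(*(CLASSES[name] for name in names))
    elsewhere = set().union(
        *(chars for (s, t), chars in arcs.items() if s == source and t != target)
    )
    return names, covered & elsewhere


# Hypothetical arcs: R loops on 'a', 'x', 'B' and moves to C on 'd'.
arcs = {("R", "R"): {"a", "x", "B"}, ("R", "C"): {"d"}}
names, conflicts = generalize(arcs, "R", "R")
```

A non-empty conflict set signals that the generalization must be
narrowed, for example by excluding the conflicting character, before
the arcs are merged.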


Trace          Synthesized program

The preprocessor is designed to accumulate knowledge
from the traces it is provided, then use the knowledge to
construct meaningful conditions. The preprocessor scans the
input trace looking at the instructions and characters that
were seen before the instructions. This phase extracts pairs
of instructions from the trace. The trace in Figure 28 would
have the instruction pairs (B,R), (R,R), (R,C) and (C,R)
extracted. Attached to each of these pairs is the set of
characters that were seen between the pair. The preprocessor
then analyzes the information to determine if a
generalization can be made from the set of characters
associated with each instruction pair.
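
The scanning phase can be sketched as a single pass over the trace
triples. The function name extract_pairs and the trace fragment below
are hypothetical; the fragment is not the trace of Figure 28, only a
short example in the same instruction, condition, instruction form.

```python
from collections import defaultdict


def extract_pairs(triples):
    """Group the characters seen between each (instruction, instruction) pair."""
    pairs = defaultdict(set)
    for first, condition, second in triples:
        pairs[(first, second)].add(condition)
    return dict(pairs)


# Hypothetical fragment: move right over 'a' and 'b', capitalize each 'd'.
trace = [
    ("B", "#", "R"), ("R", "a", "R"), ("R", "d", "C"), ("C", "D", "R"),
    ("R", "b", "R"), ("R", "d", "C"), ("C", "D", "R"),
]
pairs = extract_pairs(trace)
```

Each key of the result is an instruction pair and each value is the
character set that the analysis phase would then try to generalize.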

The natural division mentioned above allows the
preprocessor to be divided into two modules. The first
module performs the scanning function while the second
module analyzes the information and applies a heuristic to
provide the most general condition possible. The
implementation of the preprocessor will be discussed later,
but before it can be discussed an explanation of the data
structures required by the preprocessor is needed.

2. Preprocessor Data Structures

To simplify the problem we define two types of
instructions in this domain. Instructions that specify the
current location of interest are cursor positioning
instructions. Instructions that change the state of the
text are data manipulation instructions. The preprocessor
accepts as input a sequence of instructions and an
associated sequence of characters. The first instruction in
the instruction sequence is always the Start instruction,
which does not have a character associated with it. The last
instruction in the sequence is always a halt instruction.
Every action performed by the user is captured and appended
to the instruction sequence list. The character sequence is
created in harmony with the instruction sequence. In the
current state the cursor will indicate a certain position
in the text. When the user performs some action such as move
the cursor right, a monitor picks up the value in the old
position and associates that value with the instruction
executed by the user. For example, assume a user has a text
of all lower case letters that he wants to change to all
upper case letters. The user initiates the synthesizer, then
progresses across the line of text changing lower case letters
to upper case letters. For the purpose of this example,
assume the line of text is "change lower case to upper
case". As the user moves across the line making
modifications, the condition monitor captures the actions
performed and the characters seen. The example line would
yield an instruction sequence of (Start, C, R, C, R, C, R, C,
..., C, Halt). The associated character sequence would be (c,
C, h, H, a, A, ...). The C and R in the
instruction sequence are the capitalize and move right
instruction, respectively. Note that the capitalize
instruction does not reposition the cursor, and when the user
moves the cursor to the right, the result of the capitalize
instruction is associated with the move.

Another data structure needed by the preprocessor is
the ASCII vector. The ASCII vector is a 128-byte linear
array with indices numbered 0 through 127. Each byte in the
array is referenced by the decimal value of a particular
ASCII character. For example, the array element reserved for
the ASCII character "0" is indexed by 48 decimal. The array
element reserved for the ASCII character "a" is indexed by
97 decimal. The vector defines a partition of the ASCII
character set by using the following technique. The ASCII
character set has been divided into eight mutually exclusive
subsets.
Subset 0    Capital letters
Subset 1    Small letters
Subset 2    Numbers
Subset 3    Space character <sp>
Subset 4    Symbols
Subset 5    Punctuation
Subset 6    Arithmetic operators
Subset 7    Control characters


The subset name is entered into the ASCII vector at each
position by converting the ASCII character to its decimal
equivalent and using that value as the array index. The
default partition is shown in Figure 29.

Figure 29. ASCII Vector
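
The ASCII vector described above can be sketched in a few lines of
Python. This is only an illustration, not the thesis implementation:
the names are hypothetical, and the exact membership of the
punctuation, arithmetic and symbol subsets is an assumption, since
the default assignment is defined by Figure 29.

```python
# Subset labels follow the eight-way partition listed above.
CAPITAL, SMALL, NUMBER, SPACE, SYMBOL, PUNCT, ARITH, CONTROL = range(8)

# ASSUMPTION: illustrative membership only; Figure 29 defines the real default.
PUNCTUATION = set(".,;:!?'\"")
ARITHMETIC = set("+-*/=<>")


def build_ascii_vector():
    """Return a 128-entry list mapping each ASCII code to a subset label."""
    vector = [CONTROL] * 128          # codes 0-31 and 127 keep this label
    for code in range(128):
        ch = chr(code)
        if "A" <= ch <= "Z":
            vector[code] = CAPITAL
        elif "a" <= ch <= "z":
            vector[code] = SMALL
        elif "0" <= ch <= "9":
            vector[code] = NUMBER
        elif ch == " ":
            vector[code] = SPACE
        elif ch in PUNCTUATION:
            vector[code] = PUNCT
        elif ch in ARITHMETIC:
            vector[code] = ARITH
        elif 32 < code < 127:
            vector[code] = SYMBOL     # remaining printable characters
    return vector


ASCII_VECTOR = build_ascii_vector()
```

Looking up a character is then a single index operation, for example
`ASCII_VECTOR[ord("a")]`, which uses index 97 as in the text.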

The character set hierarchy is defined by the tree
structure in Figure 30. The tree is related to the ASCII
vector through the character subset names contained on each
node one level above the leaf nodes. For the default
hierarchy shown in Figure 30, a zero would be entered in the
ASCII vector for all capital letters, and a 1 would be
entered for all small letters. If a different partition of
the character set is required the user can modify the
hierarchy or create his own. An example will be given to
explain how the modification may be accomplished. Assume a
partition is desired where the vowels are isolated into a
subset. Assume further that the vowels are to be subdivided
into capital vowels and small vowels. The hierarchy would be
modified by placing a son called "vowels" on the alphabetic
node. Attach to the new node two sons, called "Cap-vowels"
and "Small-vowels", with arcs to the appropriate characters.
Relabel the hierarchy so that sibling relations are numbered
in increasing order. Finally, initialize the ASCII vector
with the new labelling. All of the modifications can be done
by the system when the user calls for the modification. The
modified hierarchy is shown in Figure 31.

Figure 30. Default Hierarchy

The final data structure used by the preprocessor is
the transition table. The transition table contains the
knowledge gleaned from scanning the instruction sequence and
the character sequence created by the monitor. Figure 32
shows the format of the transition table. The transition
table is an array of records with each record containing
information on a transition. In the table, I1 and I2 are
instructions where I2 directly follows I1 in at least one
place in the instruction sequence. "Active-sets" is a field
that contains information on sets of characters that have
been observed by the monitor on the transition from I1 to
I2. The fields "Set-1" through "Set-n" contain the value for
the set name, the count of the elements from the set associated
with the transition and a pointer to a linked list of the
elements. The records that would be created for the trace
given in Figure 22 would be associated with the transitions
from 1 to 2, 2 to C, C to R and R to S.

Figure 32. Format of the Transition Table


4. Implementation 
The context free preprocessor consists of two main
modules: the scanner and the insertion modules. Another
important module not part of the preprocessor is the user
monitor. The monitor gathers the actions of the user and
creates two arrays. One array contains the sequence of
instructions the user provided and the other contains
information of what was true before an instruction was
executed. The information that is gathered is then passed to
the appropriate preprocessor.

The example instruction and character sequences
given in Figure 33 will be the example used to explain the
mechanism of the preprocessor. Figure 33 is illustrative of
a collection of actions that were performed by some user.
The user's goal is: Change all lower case letters in a text
file into upper case letters. The user has activated the
condition monitor, positioned the cursor at the beginning of
some lines of text and moved right along the line, changing the
lower case letters to upper case whenever one appeared above
the cursor. Figure 33 is an example of output from the
monitor assuming the line the user processed was "The
numbers 1, 2, 3, 5, 7 ARE prime.". The first column in
Figure 33 is the character array. It contains the character
under the cursor prior to execution of the instruction in
column two. Column two is a trace of the actions performed
by the user. The "R" represents the "move cursor right"
instruction and the "C" represents a change without cursor
reposition instruction. Figure 33 can be read as: The
character in column one was observed and the instruction in
column two was executed.


Index    Character    Instruction
         vector       vector
  1      T            R
  2      h            C
  3      H            R
  4      e            C
  5      E            R
  6      <sp>         R
  7      n            C
  8      N            R
 ...
 23      ,            R
 24      <sp>         R
 25      2            R
 ...
 36      A            R
 37      R            R
 38      E            R
 ...
 48      e            C
 49      E            R


Figure 33. Monitor Output
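
A small simulation can make the monitor output concrete. The sketch
below is hypothetical (the function name is ours, and the Start and
halt instructions are omitted for brevity); it only reproduces the
two arrays for a user who capitalizes every lower case letter while
moving right across a line.

```python
def monitor_trace(line):
    """Return (character array, instruction array) for a capitalizing pass.

    The character recorded with each instruction is the one under the
    cursor just before that instruction executes.
    """
    characters, instructions = [], []
    for ch in line:
        if ch.islower():
            # C changes the character but leaves the cursor in place.
            characters.append(ch)
            instructions.append("C")
            ch = ch.upper()
        # R moves right; it sees the (possibly just capitalized) character.
        characters.append(ch)
        instructions.append("R")
    return characters, instructions


chars, insts = monitor_trace("The numbers 1, 2, 3, 5, 7 ARE prime.")
```

With 1-based numbering as in Figure 33, entry 2 is ("h", "C") and
entry 3 is ("H", "R"); Python lists are 0-based, so those are
`chars[1]` and `chars[2]`.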

The scan module of the preprocessor is activated
when the user indicates the representative example is
complete. Let "inst-index" be an index for the instruction
array that is initialized to 1. The first step is to create
a transition from the start instruction to the first
instruction in the instruction array and add the transition
to the transition table. This transition will indicate the
beginning of the program and will transition to the first
instruction provided on a null condition. The module then
moves down the instruction array creating other transitions
and adding them to the transition table. Duplicate
transitions will not appear in the table. A transition is
defined as a pair (I1,I2), where I1 and I2 are instructions and I2
follows I1 within the instruction array. The instruction
array in Figure 33 yields transitions (R,C), (C,R), (R,R).

The transitions are constructed by indexing through
the instruction array. The instructions at inst-index and
inst-index + 1 form a transition. The transition is then
compared against the transition table. If a match occurs, the
character in the character array at inst-index + 1 is
extracted and its ASCII value is used to index into the
ASCII vector. The value stored in the ASCII vector is used
as an exponent for two and stored in a temporary variable. A
bit by bit logical OR is performed between the temporary
variable and the Active-sets variable for the transition and
the result is stored in Active-sets. Active-sets contains
the information of every set from the partition that has
elements seen on the transition. The operation described
above allocates one bit for each set in the partition. If
Active-sets equals 1 then bit one of Active-sets is a 1,
signifying at least one element of set 1 has been seen on
the transition. A two would signify that some element of
set two had been seen and a three would signify that some
element of set one and some element of set two had been
seen.
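
The Active-sets update is a one-line bit operation. The sketch below
is illustrative only: `subset_of` is a simplified stand-in for the
ASCII vector lookup, and lumping every other character into subset 4
is an assumption made to keep the example short.

```python
def subset_of(ch):
    """Simplified stand-in for an ASCII vector lookup (illustrative labels)."""
    if ch.isupper():
        return 0   # capital letters
    if ch.islower():
        return 1   # small letters
    if ch.isdigit():
        return 2   # numbers
    if ch == " ":
        return 3   # space
    return 4       # simplification: everything else is lumped together


def record(active_sets, ch):
    """OR the bit for ch's subset (two raised to its label) into Active-sets."""
    return active_sets | (1 << subset_of(ch))


active = 0
for ch in "Ab":
    active = record(active, ch)
```

After a capital and a small letter have been seen, `active` holds the
value three, matching the reading of Active-sets given above.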

In the transition table are fields for each set that
has been determined to be active for the transition. Within
each of the set fields there are three subfields: the first
is the set name, the second is a count of the elements seen
from the set and the last is a pointer to the start of a
circularly linked list containing the elements used from the
set. The value that was obtained from the ASCII vector is
used as a set name and matched against each of the set
fields' set name. If the set name matches an entry, the
character at inst-index + 1 is added to the linked list in
lexicographical order if not already on the list and the
count is incremented by one. If a match does not occur on
the set name, a new set field is created and given the name
that was obtained from the ASCII vector, the count is set to
one and the character is put on the list.

When the scan module reaches the end of the input,
the transition table contains an entry for each transition
that was seen. Each transition is associated with all the
sets that had elements seen with the transition. Finally,
each transition is associated with the actual elements
through the linked list for each set. The information is
then passed to the insertion module for analysis. Figure 34
shows the completed transition table and the linked list of
elements for each set.
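
The whole scan can be sketched as follows. This is a minimal
illustration under stated assumptions, not the thesis code: a
dictionary keyed by (I1, I2) replaces the array of records, a sorted
Python list replaces the circularly linked list, the count is taken
to be the number of distinct elements, the initial Start transition
is omitted, and `subset_of` is a simplified stand-in for the ASCII
vector.

```python
import bisect
from dataclasses import dataclass, field


def subset_of(ch):
    """Simplified stand-in for an ASCII vector lookup (illustrative labels)."""
    if ch.isupper():
        return 0
    if ch.islower():
        return 1
    if ch.isdigit():
        return 2
    if ch == " ":
        return 3
    return 4


@dataclass
class TransitionRecord:
    active_sets: int = 0
    # subset label -> sorted list of the distinct characters seen
    set_fields: dict = field(default_factory=dict)

    def count(self, label):
        """Number of distinct elements of a subset seen on this transition."""
        return len(self.set_fields.get(label, []))


def scan(instructions, characters):
    """Build the transition table from the monitor's two arrays."""
    table = {}
    for i in range(len(instructions) - 1):
        key = (instructions[i], instructions[i + 1])
        rec = table.setdefault(key, TransitionRecord())   # no duplicate entries
        ch = characters[i + 1]            # character seen before I2 executed
        label = subset_of(ch)
        rec.active_sets |= 1 << label
        elements = rec.set_fields.setdefault(label, [])
        if ch not in elements:
            bisect.insort(elements, ch)   # keep lexicographical order
    return table


table = scan(list("RCRRR"), list("HiI 5"))
```

Here the trace corresponds to a user moving across the text "Hi 5"
and capitalizing the "i"; the table ends up with exactly the three
transitions (R,C), (C,R) and (R,R).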

Once a completed transition table has been created,
it is passed to the insertion module. The insertion
module processes the information in the transition table and
assigns a condition for each transition.

NOTE: The notation <1>, ... the linked list headed by the same symbol.

Figure 34. Completed Transition Table





The Active-sets entries provide an efficient
mechanism for recognizing potential conflicts on emanating
arcs. Performing a bit by bit AND on the Active-sets entries
that have a common originating instruction yields the source
of conflicts. The bit positions that are on (bit equals 1)
are the set (or sets) that have had elements on multiple
transitions. For example, let (I1,I2) and (I1,I3) be entries
in the transition table with Active-sets values of five (0101
binary) and three (0011 binary) respectively. Let Q equal
the result of the bit by bit AND of the Active-sets values
given above (i.e. 0001). Q indicates that there is a
conflict between the transition (I1,I2) and the transition
(I1,I3). Furthermore, Q indicates that the set causing the
conflict is labelled zero in the hierarchy of Figure 30
because the on bit is in the right most position, which
corresponds to two raised to the zero exponent. Using the
label to enter the hierarchy, it can be determined that
capital letters were seen on both transitions. Once all the
conflicts for transitions with the same originating
instruction are known, the conflicts must be resolved before
an assignment of conditions can be made.
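
The conflict test on two Active-sets values reduces to a bitwise AND
followed by reading off the bits that remain on. A minimal sketch
(the function name is hypothetical):

```python
def conflicting_sets(active_a, active_b):
    """Return the labels of the subsets seen on both transitions."""
    q = active_a & active_b
    return [bit for bit in range(q.bit_length()) if (q >> bit) & 1]


# Active-sets of five (0101) and three (0011) share only bit 0.
labels = conflicting_sets(0b0101, 0b0011)
```

For the example in the text, `labels` contains only the label zero,
the capital letters.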

Continuing the example given above, assume that eight
capital letters were seen on transition (I1,I2) and four
capital letters were seen on the transition (I1,I3). A
partial condition can be constructed for the transition
(I1,I2) as a set difference between the set of capital
letters and the actual elements seen on the transition
(I1,I3). The partial condition for the (I1,I3) transition
becomes the set of capital letters that were actually seen
on this transition. The initial conditions for these
transitions become the union of the sets indicated in
Active-sets as not being in conflict and the sets created by
the resolution of conflicts. Therefore, the condition for
(I1,I2) is ({x | x ∈ capital letters} - {x | x ∈ capital
letters on other transitions}) ∪ {x | x ∈ numeric}, and the
condition for (I1,I3) becomes {z | z ∈ ({actual capital
letters seen} ∪ {small letters})}. In this example, it was
assumed that the sets, numeric and small letters, were an
appropriate generalization for the transition. In practice
it cannot be done without consideration of the number of
elements that have been seen from the set on the transition.
If the count field for the set exceeds a threshold value for
the set, the generalization may be made; otherwise the
elements themselves become the partial condition for the
transition.
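
The resolution step and the threshold test can be sketched as below.
Two points are assumptions on our part: the text does not say how
the transition that receives the set difference is chosen, so the
sketch gives it to the transition with more observed elements (eight
versus four in the example), and the function names are hypothetical.

```python
import string

CAPITALS = frozenset(string.ascii_uppercase)


def resolve_conflict(universe, seen_a, seen_b):
    """Split a conflicting subset between two transitions.

    ASSUMPTION: the transition with more observed elements receives the
    set difference; the other keeps only the elements actually seen.
    """
    if len(seen_a) >= len(seen_b):
        return universe - seen_b, frozenset(seen_b)
    return frozenset(seen_a), universe - seen_a


def generalize(universe, seen, threshold):
    """Generalize to the whole subset only if enough elements were seen."""
    return universe if len(seen) > threshold else frozenset(seen)


cond_12, cond_13 = resolve_conflict(CAPITALS, set("ABCDEFGH"), set("WXYZ"))
```

The two partial conditions are disjoint by construction, so a
capital letter can never satisfy both transitions at once.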

After a condition has been constructed for a
transition, a final strong generalization technique is
employed. The Active-sets value for the transition again
supplies the starting point for this technique. Notice that
adjacent bits in Active-sets correspond to adjacent nodes in
the hierarchy. Therefore, a check is made of the Active-sets
to see if it has adjacent bits with a value of one. If it
does then a generalization may be attempted. Assume the
condition (({capital letters} - {A E I O U}) ∪ {small
letters} ∪ {numeric}) has been constructed for some
transition. The Active-sets value for this transition must
be seven (0111 binary). With the default hierarchy in Figure
30, a generalization to Alphabetic and then to Alpha-numeric
would be attempted. Notice that a generalization to
Alpha-numeric would fail because of a conflict with another
transition. Intuitively ({alpha-numeric} - {A, E, I, O, U})
would be a correct choice for the condition for this
transition. A general procedure for the construction of
generalized conditions is given below.

A set of nodes Y = {y_1, y_2, ..., y_n} is
generalizable to a node X if the set of nodes Y forms a
complete and exhaustive set of leaves to the subtree rooted
at X. Further, a set of nodes Z = {z_1, z_2, ..., z_m} is
generalizable to the set W = {w_1, w_2, ..., w_j}, j < m, where
each w_i is a generalization of a subset of Z.


IF    condition = r_1 ∪ r_2 ∪ ... ∪ r_m
      where r_i = z_i - q_i, i = 1, ..., m
      where q_i ⊂ z_i (q_i possibly null)

THEN  new condition is set to (w_1 ∪ ... ∪ w_j) - (q_1 ∪ ... ∪ q_m)
      where W is the smallest set
      W = {w_1, w_2, ..., w_j}
      such that W generalizes {z_1, z_2, ..., z_m}
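
The generalization step can be sketched as repeatedly replacing a
complete set of sons by their father node. The fragment of the
hierarchy below is an assumption based only on the text (Alphabetic
covers capital and small letters, Alpha-numeric covers Alphabetic
and numeric); Figure 30 defines the full default tree. The check for
conflicts with other transitions is deliberately left out.

```python
# ASSUMPTION: only the part of the default hierarchy named in the text.
HIERARCHY = {
    "alpha-numeric": ["alphabetic", "numeric"],
    "alphabetic": ["capital", "small"],
}


def generalize_nodes(nodes):
    """Replace every complete set of sons by the father, until stable."""
    nodes = set(nodes)
    changed = True
    while changed:
        changed = False
        for father, sons in HIERARCHY.items():
            if set(sons) <= nodes:
                nodes -= set(sons)
                nodes.add(father)
                changed = True
    return nodes


generalized = generalize_nodes({"capital", "small", "numeric"})
```

The excluded elements (the q_i of the procedure, such as the capital
vowels in the example) would then be subtracted from the generalized
set to give the new condition.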
C. DESIGN FOR A CONTEXT SENSITIVE ENVIRONMENT
1. Overview

Condition generation in the context sensitive
environment is a more difficult task than in the context
free environment. This difficulty arises from the scope of
knowledge required to make decisions on what a condition is
to be. The conditions themselves are more complex because
they depend not only on the character that is being seen,
but also on characters that precede and follow the
current character under consideration. The following example
will be used to illustrate the difficulties and our solution
to this problem. Assume a user wishes to capitalize all
occurrences of the word "time" in some text file. Also
assume that the word occurs at the beginning, at the end,
and in the middle of sentences in the text file. The
question is how to construct a program that performs the
desired function given only the actions the user performs as
an example of the required program.

The assumption about the position of the word "time"
in the text file implies that the requested action needs to
be accomplished on strings that have very different
characteristics. Certainly, both "time" and "Time" should be
capitalized, as should "time,", "time?" and "time<sp>". On
the other hand the string "time" should not be capitalized
when it occurs within a word like "sometime" or "timely".

A program that behaves as described
above must be able to recognize an occurrence of the string
and some variation of the string. The totality of this
information must be glued together to provide a single
condition that is descriptive of what the surrounding
environment must be like before the action is performed. The
implication is that the condition itself must be able to
perform checking and look-ahead. In other words, the
condition for the transition to the operation must in fact
be a procedure which responds "true" whenever the string of
interest is recognized. Assume for the present that the
string of interest can be discerned from the user's actions
(a hard problem by itself, see Angluin [19]). One must wonder
whether such a procedure can be constructed and then inserted
into the generated program, where it performs the function of a
condition on some transition in the program. Figure 35 shows
a procedure which recognizes the word "time". Note the
robustness of the procedure in that it distinguishes between
the differing occurrences of "time" as mentioned above.
Figure 35 points out that the problem is not just generating
a procedure as a condition but also generating conditions
within the procedure that is to be the overall condition.
The arcs labeled "T v t" and "<SP> v {punctuation}" should
be noted with interest because they provide the robustness
the condition procedure needs. The discovery of arc labels
for the condition procedure will be discussed next.

Figure 35. Condition for time and Time

2. Implementation

The monitoring of user actions provides the
instruction and character sequence in the same manner as
done in the context free mode. Consideration was given to
requiring that more information be provided by the monitor;
however, the notion was discarded because it would require
the user to be aware of the functioning of the preprocessor.
Requiring the user to provide information to the system
would betray our goal for the system. The user should only
be required to initiate the system and then perform editing
as if the system was not actively monitoring his actions. We
feel the requirement of specifying whether the user wants to
perform context free or context sensitive operations is the
maximum that should be asked. If it were feasible to
recognize the difference between the two modes from the
user's actions alone, this limitation would also be removed.

Given only the instruction sequence, the character
sequence, and the information of a context sensitive
environment, the first assignment of the context sensitive
preprocessor is to discern the string of characters upon
which some operation is to be performed. This is a pattern
recognition problem of considerable difficulty. Angluin [19]
provides the following theorem: "There is an effective
procedure which, when given a sample S as input, outputs a
pattern p which is descriptive of S". The sample S is a
subset of the set of all strings over the alphabet of the
language. The effective procedure is computationally
expensive and not implementationally desirable for our
system. The procedure is an enumeration technique on
patterns with a length less than the shortest example in the
sample set S. Each of the enumerated patterns is tested to
determine if it is descriptive of the entire set S. The
longest pattern that is descriptive of S is the most
specific pattern for the set. Clearly, as the length of
the sample grows, the number of enumerated patterns will
grow exponentially. Angluin [19] states, "In the general
case, the test performed on the patterns is an NP-complete
problem." The test she is referring to is the check to see
if an enumerated pattern is descriptive of S.

For implementation purposes, we need a mechanism
that falls well short of the exponential behavior of the
effective procedure mentioned above. The text editing domain
has two types of instructions for the purpose of this paper.
The first type of instruction will be called cursor
positioning instructions while the second type will be
called data manipulating instructions. Assuming the text
file is to be represented as a linear array, only one cursor
positioning instruction need concern us. All cursor positioning
commands such as move left, move up or move down can be
represented as move right instructions. Data manipulation
instructions operate on one character and do not reposition
the cursor.

The method that we have adopted for determining the
string of interest and the context of the string is based on
the above definition of the types of instructions available
in the text editing domain. The preprocessor scans the
instruction sequence looking for an occurrence of a data
manipulation instruction. The character associated with this
instruction is then taken as the first character of the
string of interest. Other characters are added to the string
by continuing the scan until multiple occurrences of cursor
positioning instructions are encountered. A hypothesis is
then constructed consisting of three parts. The first part
is the beginning context. It is constructed from the
characters that preceded the string in the character
sequence. The second part is the string itself and the final
part is the ending context constructed from the characters
seen after the string. For engineering considerations, the
number of characters in the beginning and ending context
will be limited to twenty characters. The probability of the
context exceeding twenty characters on both sides of the
string in the text editing domain is small enough to ignore.
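
The hypothesis construction just described can be sketched as
follows. This is an illustration under assumptions, not the thesis
code: "C" stands for any data manipulation instruction and "R" for
move right, the dictionary layout and function name are
hypothetical, and the character recorded with the R that follows a C
is treated as the result of the change rather than as a new text
position.

```python
def extract_hypotheses(instructions, characters, limit=20):
    """Return hypotheses (begin context, string, end context) from a trace."""
    # Rebuild the original text and mark which positions were changed.
    text, changed = [], []
    for k, (inst, ch) in enumerate(zip(instructions, characters)):
        if inst == "R" and k > 0 and instructions[k - 1] == "C":
            continue                  # result of the change, not new text
        text.append(ch)
        changed.append(inst == "C")

    hypotheses = []
    i = 0
    while i < len(text):
        if not changed[i]:
            i += 1
            continue
        j = i
        while j < len(text) and changed[j]:
            j += 1                    # extend the string of interest
        hypotheses.append({
            "begin": "".join(text[max(0, i - limit):i]),
            "string": "".join(text[i:j]),
            "end": "".join(text[j:j + limit]),
        })
        i = j
    return hypotheses


found = extract_hypotheses(list("RRRRCRCRCRCRRRR"), list("The tTiImMeE is"))
```

Contexts are cut off at `limit` characters on each side, following
the twenty-character bound stated above.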

Once a hypothesis is proposed it is set aside as an
active hypothesis and scanning of the input continues. Other
cases of data manipulation instructions surrounded by cursor
positioning instructions will result in other hypotheses
being constructed. As these hypotheses are added to the
active hypothesis list they are checked for consistency, and
if a new hypothesis causes conflicts they are resolved by
constructing another hypothesis from the conflicting
hypotheses. To demonstrate this mechanism we present an
example which will illustrate the generation of hypotheses
and their resolution into a condition function. The example used
is the construction of the function which will recognize the
string "time".

Suppose the text file contained the following
sentences somewhere in the file.

The time is two oclock.

It is time to go to bed.

Time the runner.

Did you run out of time?

Also suppose the user has specified the environment is to
be context sensitive and has begun to perform actions on the
file. The monitor could create the following instruction and
character sequence fragments from the user moving through
the text file and capitalizing these occurrences of "time".


(RRRRCRCRCRCRRRR ...)
(The tTiImMeE is ...)


(RRRRRRCRCRCRCRRRR ...)
(It is tTiImMeE to ...)


(RCRCRCRRRRR ...)
(TiImMeE the ...)


(... RRRRRRRRRRRCRCRCRCRR)
(... run out of tTiImMeE?)

This example is not to imply the user must change all
occurrences in the text file, but he should provide enough
examples from the file to insure his desires are understood.
If the user has not supplied a distinguishing set of
examples and an incorrect program is generated he may add to
the set of examples.

Scanning the first instruction sequence until the
last data manipulation instruction results in the string
"time" being constructed. The resulting hypothesis is that
the string "time" is within the context of "The<sp>" and
"<sp>is two oclock.". The hypothesis may be viewed as the
following data structure.

Hypothesis 1:

Begin context: The<sp>
String: time
End context: <sp>is two oclock.

A second hypothesis would be generated for the next portion
of the instruction sequence as shown below.

Hypothesis 2:

Begin context: It is<sp>
String: time
End context: <sp>to go to bed.

A comparison of these two hypotheses indicates a
disagreement between the contexts. The conflict is resolved
by determining the longest beginning and ending contexts that
agree between the two hypotheses and generating a hypothesis
reflective of this agreement. By working backward from the
last character in the begin context for both hypotheses, it
is possible to ascertain that the only character in
agreement is the space. Working forward from the first
character in the end context for both hypotheses, again the only
character in agreement is the space. A third hypothesis
with the new begin and end contexts is generated as follows:

Hypothesis 3:

Begin context: <sp>
String: time
End context: <sp>
This hypothesis specifies that the string "time"
must be preceded and followed by a space. Note the text of
the hypothesis implies the user is allowed to specify one
string during an example computation. It is also implied
that there must be a begin and an end context for the
string. Since it is possible to have two hypotheses where
one of the context strings does not agree in any of the
characters, a method must exist to provide the appropriate
context.
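
The comparison that produced Hypothesis 3 amounts to taking the
longest common suffix of the begin contexts and the longest common
prefix of the end contexts. A minimal sketch, with hypothetical
function names and an ordinary space standing for <sp>:

```python
def common_prefix(a, b):
    """Longest string that both a and b start with."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def common_suffix(a, b):
    """Longest string that both a and b end with."""
    return common_prefix(a[::-1], b[::-1])[::-1]


def reconcile(h1, h2):
    """Combine two hypotheses about the same string into a new one."""
    return {
        "begin": common_suffix(h1["begin"], h2["begin"]),
        "string": h1["string"],
        "end": common_prefix(h1["end"], h2["end"]),
    }


h1 = {"begin": "The ", "string": "time", "end": " is two oclock."}
h2 = {"begin": "It is ", "string": "time", "end": " to go to bed."}
h3 = reconcile(h1, h2)
```

An empty result for either context signals the case in which the two
hypotheses share no context characters at all.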
Whenever the comparison between contexts of two
hypotheses results in the null string, a disjunction is
formed from the characters immediately next to the string.
For example, the instruction sequence given above would give
the hypothesis:

Hypothesis 4:

Begin context: Did you run out of<sp>
String: time
End context: ?
A comparison between hypothesis 3 and hypothesis 4
would result in the null string for the end context. Since
there must be an end context, the disjunction of <sp> and ?
is formed and this becomes the end context for the new
hypothesis. Generalization techniques that were mentioned in
the section on the context free environment are then applied in
an attempt to reduce the end context to the most general
context consistent with the data seen. The only alteration
in the generalization scheme is the lowering of the
threshold values for important sets. In this example, the
threshold value for the punctuation set would be lowered to
1 and the ending context would become {x | x = space or x ∈
{Punctuation}}.
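
The fallback to a disjunction, followed by the generalization with a
lowered threshold, might look like the sketch below. The membership
of the punctuation set and the tagged return value are assumptions,
and a threshold of 1 is treated here as being met by a single
observed punctuation character.

```python
import os.path

PUNCTUATION = frozenset(".,;:!?")   # ASSUMPTION: illustrative membership


def merge_end_context(end_a, end_b, punct_threshold=1):
    """Merge two non-empty end contexts into one.

    Returns ("literal", text) when the contexts share a prefix, otherwise
    ("one-of", characters) built from the characters next to the string.
    """
    # os.path.commonprefix compares character by character.
    shared = os.path.commonprefix([end_a, end_b])
    if shared:
        return ("literal", shared)
    choices = {end_a[0], end_b[0]}
    if len(choices & PUNCTUATION) >= punct_threshold:
        choices |= PUNCTUATION       # generalize to the whole subset
    return ("one-of", frozenset(choices))


kind, allowed = merge_end_context(" ", "?")
```

For the end contexts of Hypotheses 3 and 4 this yields the set
"space or any punctuation", as in the text.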

The final problem to be solved is the recognition of
variations in a string. Examples of variations of a string
are "Time" and "time", or "enclosure" and "inclosure". As
mentioned, if the user intends to capitalize all occurrences
of "time", "Time" is to be included. Note these variations
of the string become the compound labels for the arcs in
Figure 35. The system includes a rule that enables the
recognition of variations of strings provided the user gives
an example of the variation. The rule simply states that the
string length will be established to be as long as the
longest string encountered during processing. Again, using
the example, the hypothesis for "Time the runner." would be:

Hypothesis 5:

Begin context: ... T
String: ime
End context: <sp>the runner.


It has been established by preceding user actions
that the string length for the hypothesis should be 4. By
matching the pattern in hypothesis 5 with the string from
the earlier hypotheses it can be determined that the string in
Hypothesis 5 should be expanded by inserting a "T" in front
of the string. Another hypothesis is then generated where
the string will be the disjunction between the strings "time"
and "Time". The final hypothesis from the example would then
be:

Hypothesis 6:

Begin context: <sp>
String: "time" v "Time"
End context: {x | x = space or x ∈ Punc.}
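
The length rule can be sketched as pulling the missing characters
out of the begin context and then collecting the variants. The
function names are hypothetical and the sketch assumes the missing
characters always lie immediately in front of the string.

```python
def expand_string(begin, string, length):
    """Lengthen a short string with characters taken from its begin context."""
    missing = length - len(string)
    if missing <= 0:
        return begin, string
    return begin[:-missing], begin[-missing:] + string


def merge_strings(*strings):
    """Form the disjunction of the string variants seen so far."""
    return frozenset(strings)


begin5, string5 = expand_string("T", "ime", 4)
variants = merge_strings("time", string5)
```

Applied to Hypothesis 5, the "T" moves from the begin context into
the string, giving the two variants used in Hypothesis 6.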
Once this hypothesis has been generated, it is then
used to examine the input for negative examples that can
strengthen or weaken the hypothesis. Suppose the input
contained the fragment "... timely results ...". Processing
the input with Hypothesis 6 would show a match for the
string but the end context would not agree; therefore, the
hypothesis will be strengthened by changing the end context
as shown below:


Final Hypothesis:

Begin context: <sp>
String: "time" or "Time"
End context: {x | x = space v
              x ∈ Punc. &
              x ∉ small letters}
After the input has been processed and a final
hypothesis proposed, the hypothesis is used to construct a
procedure such as shown in Figure 35. The first part of the
procedure to be constructed is the transitions for the
begin context. The states in the procedure are the
instructions in the instruction set, and the arc labels
consist of the information in the final hypothesis. A start
state is placed in the procedure with an arc to a move right
instruction (R). Since the procedure is a string match or
look-ahead routine, all states other than the start state
will be move right instructions. Each of the states will
have two arcs exiting it. The labels of these two arcs
will be the negation of each other.

The construction is accomplished by placing the
first character of the begin context on the exiting arc
going to a new move right state. The other arc is labeled
with the negation of the character and this arc terminates
at the first move right state. Each character of the begin
context creates another move right state labeled as
mentioned.

The string from the hypothesis is then used to
complete the procedure that has been partially constructed.
If the string is composed of disjunctions, the characters
are used to form disjunctions. Each of the disjunctions is
combined with conjunctions. The final hypothesis above
provides a string of "time" or "Time". The conjunction of
disjunctions will be formed as:

(t v T) & (i v i) & (m v m) & (e v e)

Upon reduction the string will be expressed as:

(t v T) & "i" & "m" & "e"


Each disjunction becomes a label on an arc to a new move
right state and the negation becomes the label on an arc
back to the original move right state.
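
The formation and reduction of the conjunction of disjunctions can
be sketched position by position. The function name is hypothetical
and the sketch assumes all variants have the same length, as
established by the length rule.

```python
def conjunction_of_disjunctions(variants):
    """Render equal-length string variants as a reduced arc-label expression."""
    terms = []
    for chars in zip(*variants):
        # Keep each distinct character once, in order of first appearance.
        distinct = sorted(set(chars), key=chars.index)
        if len(distinct) == 1:
            terms.append(distinct[0])               # (i v i) reduces to i
        else:
            terms.append("(" + " v ".join(distinct) + ")")
    return " & ".join(terms)


label = conjunction_of_disjunctions(["time", "Time"])
```

Each term of the result corresponds to the label of one arc leading
to a new move right state.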

Finally, the end context is added in the same manner
as the begin context. The first character becomes the label
on the last move right state created from the string and new
states are added for each character in the end context. The
result of these operations is displayed in Figure 35.
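
A procedure of the kind shown in Figure 35 can be imitated with a
small state machine. The sketch below is an assumption-laden
illustration rather than the generated procedure itself: every state
is a move right, a matching character follows the arc to the next
state, any other character follows the negation arc back to the
beginning, the punctuation set is illustrative, and overlapping
matches are not handled.

```python
PUNCTUATION = frozenset(".,;:!?")   # ASSUMPTION: illustrative membership


def make_recognizer(begin, string_positions, end):
    """Build a matcher from a final hypothesis.

    begin and end are sets of acceptable context characters;
    string_positions holds one set of characters per string position.
    """
    pattern = [begin] + list(string_positions) + [end]

    def find(text):
        hits, state = [], 0
        for index, ch in enumerate(text):
            if ch in pattern[state]:
                state += 1                           # arc to next state
            else:
                state = 1 if ch in pattern[0] else 0  # negation arc
            if state == len(pattern):
                hits.append(index - len(string_positions))
                state = 1 if ch in pattern[0] else 0
        return hits                                  # start of each string

    return find


find_time = make_recognizer(
    begin={" "},
    string_positions=[{"t", "T"}, {"i"}, {"m"}, {"e"}],
    end={" "} | PUNCTUATION,
)
```

Because the begin context of the final hypothesis is a space, an
occurrence at the very start of the text is not reported by this
sketch.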
IV. CONCLUSIONS AND RECOMMENDATIONS


A. SYNTHESIZER 

The synthesizer that has been implemented for this
thesis will produce programs from example computations in a
reasonable amount of time. The system response for most of
the traces was within 10 seconds on a Digital Equipment
Corporation PDP-11/50 minicomputer. The response time is a
function of the length of the trace and the number of
multiple occurrences of a particular instruction or set of
instructions in the final program, with multiple
occurrences of an instruction affecting response time the
most. As Biermann [17] has noted, this has a nice
implication for programming by example because most
algorithms do not exhibit the characteristic of having a
large number of instances of the same instruction. In other
words, almost all multiple occurrences of an instruction in
an input trace are indicative of a loop in the algorithm.

In all of the test cases except those that required a
large amount of backups, static processing accounted for at
least half of the total response time. Future modifications
to the synthesizer which would decrease the total response
time could be directed toward designing the static
processing stage more efficiently. However, the trade-off
between static processing and dynamic processing must be
kept in perspective. Static processing is a linear function
of the length of the trace, whereas dynamic processing,
since it is an enumerative search technique, is an
exponential function of the length of the trace.
Another area which should be considered is the dynamic
processing stage. There exists a plethora of research
questions within this area. The primary one is: Can more
information be gleaned from the input trace during static
processing which will decrease the search time for dynamic
processing? Difference sets and couple-classes provide some
powerful mechanisms for decreasing the amount of search;
however, lower bounds computations on the number of states
required by the machine often increase the amount of search.
Lower bounds are restrictive in nature. They are designed to
force the final algorithm into a minimum state configuration
which, in many cases, causes extra search time. Relaxation
of the lower bounds computation will result in a final
algorithm which may not be expressed in a minimum number of
states, but which will still be deterministic. There might
be better methods of initially computing the number of
states which would result in a closer estimate of the actual
number of states required for the algorithm. Obviously, the
closer the initial guess is to the actual requirement, the
less backup incurred, and, therefore, the less search time
required.

Since the amount of search required is governed by the
failure memory entries, the more dense the failure memory
can be made, the more directed the search becomes. So
another area for research is to determine if more
information exists in the failure memory entries than is
currently being used. How much information do the structure
factor and the tree state factor provide? Is there another
factor which would be useful?

Finally, a more general question can be addressed. The
underlying structure of this technique is an enumerative
search. Can the technique be generalized to include other
algorithms which are enumerative in nature? What
modifications to the failure memory are needed? How would
difference sets and couple-classes be redefined?


B. CONDITION PROCESSING

The condition processor front-end to the synthesizer
relieves the user from worrying about some of the control
structure considerations by automatically generating
conditions. Another addition which would increase the power
of the synthesizer is an automatic loop variable generator
as discussed by Biermann [18]. Although the text editing
environment has been used in this thesis work, the part of
the condition processor design which deals with a context
free environment is general enough that it could be designed
to operate in any domain.


Condition generation in a context sensitive environment
is a much harder problem, further complicated by requisite
pattern matching and pattern generation. Before this type of
condition generation can be generalized, much work has to be
done to increase the efficiency of pattern generation
schemes. Angluin [19] has shown a pattern generation scheme
which is a polynomial time algorithm for pattern generation
with one variable, but the domain we have examined will
require at least two variables. There is no known polynomial
time algorithm for pattern generation with two variables.
Heuristic techniques will probably be necessary to provide
methods of pattern generation which will be fast enough to
be useful over a wide range of problems.